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Abstract 

Autoimmune  diseases  occur  when  immune  cells  fail  to  develop  or  lose  their  tolerance 
toward  self  and  destroy  body’s  own  tissues.  Both  insufficient  negative  selection  of  self-reac¬ 
tive  T  cells  and  impaired  development  of  regulatory  T  cells  preventing  effector  cell  activation 
are  believed  to  contribute  to  autoimmunity.  Genetic  predispositions  center  around  the  major 
histocompatibility  complex  (MHC)  class  II  loci  involved  in  antigen  presentation,  the  key 
determinant  of  CD4*  T  cell  activation.  Recent  studies  suggested  that  variants  in  the  MHC 
region  also  exhibit  significant  non-additive  interaction  effects.  However,  collective  interac¬ 
tions  involving  large  numbers  of  single  nucleotide  polymorphisms  (SNPs)  contributing  to 
such  effects  are  yet  to  be  characterized.  In  addition,  relatively  little  is  known  about  the  cell- 
type-specificity  of  such  interactions  in  the  context  of  cellular  pathways.  Here,  we  analyzed 
type  1  diabetes  (T 1 D)  and  rheumatoid  arthritis  (RA)  genome-wide  association  data  sets  via 
large-scale,  high-performance  computations  and  inferred  collective  interaction  effects 
involving  MHC  SNPs  using  the  discrete  discriminant  analysis.  Despite  considerable  differ¬ 
ences  in  the  details  of  SNP  interactions  in  T1 D  and  RA  data,  the  enrichment  pattern  of 
interacting  pairs  in  reference  epigenomes  was  remarkably  similar:  statistically  significant 
interactions  were  epigenetically  active  in  cell-type  combinations  connecting  B  cells  to  T  cells 
and  intestinal  epithelial  cells,  with  both  helper  and  regulatory  T  cells  showing  strong  dis¬ 
ease-associated  interactions  with  B  cells.  Our  results  provide  direct  genetic  evidence  point¬ 
ing  to  the  important  roles  B  cells  play  as  antigen-presenting  cells  toward  CD4*  T  cells  in  the 
context  of  central  and  peripheral  tolerance.  In  addition,  they  are  consistent  with  recent 
experimental  studies  suggesting  that  the  repertoire  of  B  cell-specific  self-antigens  in  the  thy¬ 
mus  are  critical  to  the  effective  control  of  corresponding  autoimmune  activation  in  peripheral 
tissues. 
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Introduction 

Autoimmune  diseases  [1],  such  as  type  1  diabetes  (TID)  [2],  rheumatoid  arthritis  (RA)  [3], 
and  multiple  sclerosis  [4],  arise  from  the  inadequate  control  of  immune  cell  reactivity  toward 
self-antigens  and  the  resulting  destruction  of  target  organs.  In  both  TID  and  RA,  genome¬ 
wide  association  studies  (GW AS)  have  revealed  dominant  effects  of  the  major  histocompatibil¬ 
ity  complex  (MHC)  region,  whose  polymorphisms  affect  MHC  class  II  antigen  presentation 
and  recognition  [5-7].  Many  additional  loci,  discovered  from  studies  using  genome-wide 
array  and  Immunochip  designs  [8-10],  reinforce  this  picture  by  revealing  the  disease  associa¬ 
tion  of  numerous  receptors  and  regulators  mediating  such  interactions,  including,  for 
instance,  PTPN22  and  CTLA4  [5,  7]. 

In  TID,  the  autoimmune  action  takes  the  form  of  T  cells  infiltrating  the  pancreas  and 
destroying  insulin-producing  P-ceUs.  Although  the  presence  of  autoantibodies  indicates  that 
humoral  immunity  contributes  to  this  late-stage  pathogenesis  [2, 11, 12],  this  mechanism  also 
depends  on  activation  by  cognate  CDd"^  T  cells.  RA,  characterized  by  inflammations  affecting 
small  joints  of  hands  and  feet,  occurs  when  T  cells,  B  cells,  and  macrophages  enter  the  syno¬ 
vium  and  destroy  local  tissues  [3].  Evidence  suggests  that  the  B  cell  receptor  (BCR) -mediated 
antigen  presentations  by  B  cells  in  the  periphery  are  critical  for  the  activation  of  these  cognate 
CDT"^  T  cells  in  both  TID  [13]  and  RA  [14, 15].  Important  roles  B  cells  play  have  also  been 
established  in  other  autoimmune  diseases  including  systemic  lupus  erythematosus  [16]. 

The  helper  T  cells  (Th)  specific  to  self-antigens  originate  from  the  thymus,  where  the  imma¬ 
ture  T  cell  repertoires  are  first  selected  for  moderate  self-reactivity  (positive  selection)  by  corti¬ 
cal  thymic  epithelial  cells  (cTECs)  [17].  The  subsequent  negative  selection  of  these  cells  in  the 
meduUa  depends  on  the  strength  of  interactions  with  a  range  of  antigen-presenting  cells 
(APCs)  [18],  which  include  medullary  thymic  epithelial  cells  (mTECs)  and  dendritic  cells 
(DCs).  The  mTECs  promiscuously  express  tissue-restricted  antigens  (TRAs),  including  insu¬ 
lin,  promoted  by  the  transcription  factor  AIRE.  These  antigens  are  either  presented  by  mTECs 
themselves  or  “handed-over”  to  DCs  for  presentation  on  MHC  class  11  molecules  toward 
immature  T  cells.  Strongly  reactive  T  cell  subsets  are  subsequently  led  to  apoptosis.  Recent 
studies  suggested  that  in  addition  to  mTECs  and  DCs,  thymic  B  cells  can  also  act  as  APCs  [19], 
expressing  AIRE  and  TRAs  [20].  B  cells  therefore  appear  to  act  as  APCs  both  in  thymic  selec¬ 
tion  and  in  the  peripheral  activation  of  T^  cells,  which  presumably  reflect  the  need  to  train  T 
cell  populations  in  the  thymus  against  the  antigen  repertoire  specific  to  B  cell  presentation  in 
the  periphery  [20]. 

This  clonal  deletion,  however,  is  incomplete,  and  many  T  cells  migrating  into  peripheral  tis¬ 
sues  are  now  known  to  be  self- reactive  even  in  healthy  individuals  [21].  The  deleterious  effects 
of  auto-reactivity  are  kept  in  check  by  the  suppressive  action  of  regulatory  T  cells  (T^eg),  whose 
natural  variant  originates  from  differentiation  of  immature  T  cells  in  the  thymus  [22].  These 
Treg  cells  migrate  into  peripheral  lymphoid  organs  and  suppress  the  activation  of  self-reactive 
effector  cells  [23].  The  current  consensus  is  that  both  negative  selection  (the  likely  fate  of  T 
cells  with  stronger  affinity  to  self-antigens)  and  T^eg  cell  induction  (more  likely  for  those  with 
intermediate  affinity  range)  in  the  thymus  during  the  prenatal  time  period  are  crucial  for  the 
effective  control  of  auto-reactivity  in  peripheral  tissues  [21]. 

Tracking  down  the  exact  cellular  and  molecular  events  in  these  two  aspects  of  tolerance 
(negative  selection  and  T^eg  differentiation)  is  key  to  the  development  of  intervention  and 
treatment  strategies  against  autoimmune  diseases  [  1 1  ] .  It  is  currently  not  clear,  for  instance,  to 
what  extent  different  ceU  types  with  the  capacity  to  act  as  APCs  (mTECs,  DCs,  thymic  and 
peripheral  B  cells)  individually  contribute  to  these  processes.  We  show  in  this  work  that  in 
addition  to  cell  biological  methods  using  transgenic  mice,  analyses  of  genetic  data  can  also 
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shed  light  on  such  issues.  Our  approach  is  based  on  enhanced  inference  of  epistatic  effects 
between  genetic  factors,  which  likely  play  important  roles  in  heritability  components  undetect¬ 
able  from  single-variant  analyses  [24] .  Many  recent  studies  pointed  to  the  importance  of  non¬ 
additive  interaction  effects  in  autoimmune  disease  associations  within  the  dominant  MHC 
loci  and  among  the  classical  HLA  alleles  [25,  26].  Large-scale  collective  interactions  involving 
multiple  MHC  SNPs,  however,  remain  to  be  characterized.  Furthermore,  it  is  of  interest  to 
identify  the  types  of  lymphocytes  or  tissues  in  which  these  interactions  occur,  taking  into 
account  tissue-specific  epigenetic  modifications  of  the  genetic  loci:  many  disease-associated 
single-nucleotide  polymorphisms  (SNPs)  are  in  the  non-coding  region,  presumably  exerting 
their  effects  via  changes  to  gene  regulatory  mechanisms,  which  are  often  ceU-type-specific  [9]. 

In  this  work,  we  tested  the  following  hypotheses  relevant  to  the  cell-type-specific  genetic 
effects  of  autoimmune  risks  discussed  above: 

1.  Non-additive,  collective  interaction  effects  associated  with  disease  status  reflect  cellular 

interaction  mechanisms  underlying  pathogenesis. 

2.  Interrogation  of  the  spatially  resolved  interaction  patterns  allows  for  the  identification  of 

specific  genetic  loci  responsible  for  such  interactions. 

The  first  hypothesis  is  an  extension  of  its  counterpart  implicit  in  standard  non-interacting 
SNP  analyses:  disease-associated  loci  identified  by  independent-SNP  treatments  reveal  the 
effects  of  genes  or  non-coding  regulatory  factors  participating  in  pathogenesis  mechanisms, 
which  can  be  inferred  from  their  ceU-type-specificity.  Analyses  incorporating  explicit  interac¬ 
tion  effects  can  reveal  higher-order  effects,  such  as  interaction  of  two  genetic  factors,  each 
expressed  highly  in  different  cell  types  (e.g.,  T  cells  and  their  APCs).  The  second  hypothesis  is 
an  extension  of  the  fine-mapping  or  conditional  analysis  targeting  a  causal  SNP  or  spatial 
region  within  a  loci  masked  by  linkage  disequilibrium  (LD). 

We  addressed  these  questions  by  using  a  recently  developed  algorithm,  discrete  discrimi¬ 
nant  analysis  (DDA)  [27],  to  infer  non-additive  interaction  effects  involving  a  large  number 
of  SNPs  in  association  with  autoimmune  disease  risks.  The  results  revealed  that  in  addition 
to  exhibiting  extensive  LD,  the  MHC  region  also  possesses  differential  LD  patterns  between 
case  and  control  groups,  which  translate  into  collective  disease  associations  caused  by  a  large 
number  of  SNPs  acting  cooperatively.  We  interpreted  such  collective  effects  on  disease  risks 
in  cell-type/ tissue-specific  manner,  addressing  the  question  of  how  different  APCs  interact 
with  thymic  and  peripheral  CDd"^  T  cells  and  how  disease-associated  genes  interact  within 
the  context  of  T  cell  signaling  and  differentiation.  We  analyzed  two  representative  autoim¬ 
mune  disease  data  sets  (TID  and  RA)  from  the  Wellcome  Trust  Case-Control  Consortium 
(WTCCC)  study  [6]  and  assessed  the  outcomes  by  comparison  with  the  summary  statistics 
of  more  recent  meta-analysis  data  [8, 10].  The  use  of  relatively  smaller  but  well-characterized 
data  sets  facilitated  computational  analysis  necessary  for  DDA,  which  optimizes  all  possible 
simultaneous  interactions  involving  many  SNPs  while  testing  for  significance  of  interacting 
pairs.  We  first  focused  on  the  MHC  region,  where  instead  of  attempting  to  pinpoint  a  few 
SNP  pairs  with  high  significance,  we  extracted  a  large  pool  of  SNP  pairs  with  moderate 
degrees  of  association  and  examined  the  enrichment  of  epigenetically  active  subsets.  We 
found  robust  epigenome-specific  interaction  patterns  common  to  both  TID  and  RA,  which 
suggested  that  B  cells  acting  as  APCs  are  the  main  sources  of  genetic  predispositions  reflected 
in  MHC  loci.  This  observation  supports  experimental  evidence  for  the  role  of  B  cells  as  APCs 
in  general  [20]  and  in  RA  as  shown  by  Taneja  and  co-workers  [15].  We  further  comple¬ 
mented  this  analysis  by  selecting  SNPs  genome-wide  based  on  biological  pathways  for  DDA 
collective  inference  analysis. 
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Results 

Independent-SNP  analysis 

We  first  analyzed  the  WTCCC  TID  genome-wide  data  using  the  independent-SNP  special 
case  of  the  DDA  algorithm.  This  special  case  with  interactions  turned  off  yields  results  closely 
matching  logistic  regression-based  analyses  [27]  and  allows  for  the  identification  of  an  initial 
candidate  set  of  statistically  significant  SNPs.  The  genome-wide  scan  was  consistent  with  the 
original  report  [6]  (Fig  lA),  dominated  by  the  MHC  region  and,  for  TID,  with  additional  sig¬ 
nals  near  PTPN22  on  chromosome  1  and  loci  on  chromosomes  12  and  16.  CTLA4  (strongest 
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Fig  1 .  Genome-wide  association  of  type  1  diabetes  (T1 D)  and  rheumatoid  arthritis  (RA)  single¬ 
nucleotide  polymorphisms  (SNPs)  under  independent-SNP  and  collective  inferences.  (A) 

Independent-SNP  p-value  profiles  based  on  genotypic  model.  Note  that  the  p-values  are  in  “double 
logarithmic”  scale  for  clarity.  (B-C)  Optimization  of  collective  inference  model  parameters  with  cross-validation 
area  under  the  curve  (AUC)  as  a  function  of  model  sizes  (mean  number  of  SNPs  selected)  and  inverse  of 
penalizer  1/A.  The  small  and  large-1/A  limits  correspond  to  non-interacting  and  strongly  interacting  limits, 
respectively.  Vertical  bars  are  95%  c.i.  (D-E)  Statistical  significance  of  the  rise  in  AUC  from  interaction  effects 
(AAUC;  defined  as  the  difference  between  the  maximum  AUC  and  the  non-interacting  limit).  For  T1 D  (D)  and 
RA  (E),  the  first  m  SNPs  were  selected  from  the  sorted  list  with  increasing  order  of  single-SNP  p-value  p„  and 
the  null  distribution  of  AAUC  was  sampled  by  permutation  of  the  phenotypic  label  to  estimate  the  p-value 
{bottom-,  horizontal  line  represents  p  =  0.05)  for  the  significance  of  the  actual  AAUC  observed  (top). 

doi:1 0.1 371/journal. pone.01 69918.g001 
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SNP  rs231726,  independent-SNPp-valuep;  =  5x10“*"),  IL2RA  (rsl0795791,p;  =  4x10“^),  and 
INS  loci  (rs6578252,p;  =  2x10“^)  hadp-values  below  genome-wide  significance.  Thep,-  profile 
for  RA  was  similar  to  TID  but  with  significantly  weaker  association  strengths:  the  bulk  of 
genetic  factors  was  also  explained  by  the  MHC  group,  with  the  second  robustly  identified 
locus  near  PTPN22. 

Cross-validation 

Using  the  quality-controUed  whole-genome  SNP  set,  we  performed  cross-validation  to  assess 
the  inference  performance  with  varying  numbers  of  selected  SNPs  and  determined  the  optimal 
degree  of  interactions  to  be  included  by  varying  the  parameter  X.  The  overall  quality  of  model 
fit  is  represented  by  the  area  under  the  curve  (AUC)  of  the  receiver  operating  characteristics,  a 
measure  ranging  from  1.0  to  -0.5,  with  decreasing  sensitivity  and  specificity.  We  selected 
SNPs  {m  in  total  number  in  each  run)  withp,  below  a  cutoff  The  penalizer  X  values  of  0  and 
00  correspond  to  the  limits  of  strongest  interactions  and  no  interaction,  respectively.  In  Fig  IB 
and  1C,  AUC  is  plotted  as  a  function  of  l/X  such  that  the  non-interacting  limit  is  reached  on 
the  left  and  interaction  strength  increases  with  increasing  l/X. 

For  TID,  we  found  high  AUC  values  (-0.85)  in  the  strongly  interacting  regime  (large  l/X), 
consistent  with  the  general  observation  that  TID  association  is  one  of  the  strongest  among 
common  diseases  [5]  (Fig  IB).  The  non-interacting  limit  showed  an  AUC  range  of  0.75-0.80, 
which  is  similar  to  values  reported  previously  in  the  literature  for  predictive  models  (polygenic 
risk  scoring)  without  interaction  [28].  The  maximum  AUC  values  were  lower  than  but  compa¬ 
rable  to  those  reported  (-0.89)  using  support  vector  machines  [29].  We  also  varied  model  sizes 
(number  of  SNPs  included)  and  observed  lower  AUCs  when  the  mean  SNP  number  increased 
beyond  -100,  which  implies  that  the  model  performance  is  dominated  by  SNPs  with  the  stron¬ 
gest  association  (MHC  region;  Fig  lA)  and  that  including  extra  variants  purely  based  onp,  led 
to  a  net  dilution  of  prediction  performance.  Analogous  cross-validation  for  RA  data  (Fig  1C) 
showed  clear  maxima  near  l/X  -  100,  with  the  maximal  values  rising  moderately  from  0.69 
{pi  <  10“^°;  mean  SNP  number  39)  to  0.70  (p,-  <  10“^;  mean  SNP  number  201)  with  increasing 
model  sizes.  The  absolute  AUC  values  were  lower  than  in  TID,  reflecting  a  weaker  overall 
strength  of  association  under  similar  sample  sizes. 

We  evaluated  the  statistical  significance  of  the  maximum  AUC  values  relative  to  the  non¬ 
interacting  limit  by  considering  it  as  a  statistic  and  sampling  its  null  distribution  with  pheno¬ 
type  label  permutation  (Fig  ID  and  IE).  The  SNP  number  m  was  varied  by  selecting  those 
with  the  lowest  p;.  The  AUC  change  due  to  interaction  (AAUC)  rose,  on  average,  with  increas¬ 
ing  m.  The  corresponding p-values  decreased  with  increasing  m,  becoming  smaller  than  the 
nominalp  =  0.05  for  m  >  10  (TID)  and  m  >  40  (RA). 

Collective  interaction  analysis 

We  then  aimed  to  derive  subsets  of  SNPs  potentially  exerting  significant  interaction  effects  for 
further  analysis  based  on  independent-SNP  p-value  profiles.  For  TID,  selecting  SNPs  using 
pi  <  10“^°  yielded  m  -  100  SNPs  only  from  the  MHC  region.  We  sought  to  include  possible 
interactions  with  non-MHC  loci  by  selecting  50  SNPs  from  the  MHC  region  and  10  each  from 
PTPN22,  CTLA4,INS,IL2RA,  and  12ql3/12q24  (“chr-12”)  loci  [6],  respectively.  We  compiled 
the  lists  of  known  SNPs  [30]  in  strong  LD  (r^  >  0.5)  with  these  “proxy”  SNPs,  which  covered 
PTPN22,  CTLA4,  MHC  class  111  and  II  regions,  and  a  part  of  IL2RA  gene  (SI  Fig);  the  SNP 
group  closest  to  the  INS  gene  (proxy:  rs6578252)  was  -40  kb  downstream  from  it  on  chromo¬ 
some  1  lpl5.5,  partially  explaining  the  lack  of  INS  association  from  the  WTCCC  data.  We  then 
derived  collective  inference  parameters  for  this  SNP  selection  using  the  optimal  penalizer 
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value  {X  =  0.01)  suggested  by  cross-validation  (Fig  IB)  and  tested  the  significance  of  single- 
SNP  and  interaction  parameters  (Fig  2). 

The  single-SNPp-values  shown  in  Fig  2  (middle)  represent  the  significance  of  the  additive 
part  of  the  model  association.  These  values  reduce  to  the  independent-SNP  p,-  (Fig  2,  middle, 
open  bars,  and  Fig  lA)  when  interactions  are  turned  off  [27].  When  inferred  with  interactions, 
these  single-SNP  p-values  in  the  MHC  group  were  significantly  reduced  (Fig  2,  middle,  solid 
bars),  while  many  of  the  non-MHC  SNPs  were  enhanced.  The  SNP  with  the  smallest  single- 
SNP  p-value  (rs9273363,p  ~  10“'^"')  remained  the  same  with  interaction  as  in  the  indepen- 
dent-SNP  case.  This  SNP  is  ~1  kb  upstream  from  HLA-DQBl  gene  and  in  LD  with  rs3830060 
(r^  =  0.517)  located  in  the  coding  region.  The  observation  of  highest  TID  association  with 
SNPs  near  the  HLA-DQBl  gene  is  in  agreement  with  fine-mapping  studies  [26].  The  shift  of 
association  signals  into  interactions  was  drastic  for  some  SNPs:  two  of  them  (rs9268480  and 
rs3763307)  near  BTNL2  gene  showed p,-  ~  10“'*^,  which  shifted  top  ~  10“^  with  interaction. 
Another  example  is  provided  by  four  SNPs  (rs9276429,  rs9276431,  rs9276432,  and  rs2051549) 
in  HLA-DQA2  and  HLA-DQB2  that  showed  shifts  fromp  <  10“"'"^  (non-interacting)  top  ~ 

10"®  (collective).  The  effects  of  these  SNPs  are  expected  to  arise  primarily  from  interactions.  In 
contrast,  many  variants  outside  MHC  showed  enhanced  single-SNP  p-values. 

We  also  noted  relatively  strong  associations  from  four  SNPs  (rs241427,  rs4148882, 
rs241403,  and  rs3101942)  near  TAP1/TAP2  and  PSMB8/PSMB 9.  TAPl/2  encode  protein  sub¬ 
units  transporting  antigen  peptides  into  endoplasmic  reticulum  (ER)  in  the  MHC  class  I  anti¬ 
gen  presentation  pathway  [31],  whereas  PSMB8/9  code  for  immune  cell-specific  versions  of 
the  catalytic  subunits  (psi  and  pii)  of  proteasomes  that  degrade  peptides.  The  proxy  SNP 
rs241427  (and  three  SNPs  in  LD)  resided  inside  TAP2,  whereas  rs4148882,  rs241403, 
rs3101942  aU  had  similar  distributions  of  SNPs  in  LD  that  extended  from  the  3’-end  of  TAP2 
to  the  region  close  to  HLA-DMB  encompassing  PSMB8/9  and  TAPI  (S2  Fig). 

The  changes  in  the  single-SNP  p-values  were  offset  by  extensive  interaction  patterns,  nota¬ 
bly  within  the  MHC  region,  reflected  in  the  interaction  p- value  landscape  (Fig  2,  top).  One  of 
the  key  sub-regions  was  near  HLA-DQAl  and  HLA-DQBl  (rs9272219  and  rs9273363),  whose 
protein  products  form  the  MHC  class  II  heterodimer  on  the  surface  of  APCs  that  display  anti¬ 
gens  toward  the  T  cell  receptor  (TCR).  HLA-DQBl  encodes  the  DQ  P-chain  that  interacts 
directly  with  TCR,  and  we  deduce  that  a  significant  part  of  its  dominant  role  in  TID  associa¬ 
tion  stems  from  interactions.  We  compared  these  interaction  patterns  with  LD  (S3  Fig). 
Interactions  between  SNPs  inferred  from  DDA  manifest  themselves  as  differences  in  allele-fre¬ 
quency  correlations  between  case  and  control  groups:  if  LD  patterns  within  case  and  control 
groups  were  identical,  the  difference  would  vanish  even  when  individual  correlations  are  large. 
For  instance,  although  there  was  a  strong  LD  within  the  CTLA4  and  IL2RA  loci  in  S3  Fig,  the 
corresponding  interactions  were  largely  absent  (Fig  2). 

For  RA,  we  used  a  different  SNP  selection  method  to  examine  whether  strong  LD  in  the 
MHC  region  affected  the  outcome:  we  first  selected  a  larger  number  (m  =  255)  of  SNPs  and 
used  clustering  to  filter  them  into  a  selection  of  m  =  70  SNPs.  Compared  to  TID,  this  RA  selec¬ 
tion  had  significantly  reduced  redundancy  of  SNPs  with  similar  LD  patterns,  and  included 
~20  SNPs  in  the  MHC  class  I  region  (Fig  3).  We  inferred  additive  single-SNP  and  interaction 
p-values  of  the  selected  model:  in  the  non-interacting  case  (Fig  3,  middle,  open  bars),  strongest 
association  was  observed  within  the  selection  at  rs6457620  (HLA-DQBl;pi  =  3x10”^^), 
rs9268560  (HLA-DRA; pi  -  5x10“*’'’),  and  rs2076533  {BTNL2;pi  -  3x10“'’°).  In  comparison, 
when  interactions  were  included,  single-SNP  p-values  were  more  uniformly  distributed  (Fig  3, 
middle,  solid  bars).  SNP  pairs  with  low  interaction  p-values  were  less  concentrated  in  the 
MHC  class  II  region  than  in  TID  (Fig  2),  reflecting  the  smaller  number  of  SNPs  in  high  LD. 
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Fig  2.  T1 D  collective  inference  p-values  and  epigenetic  activity  distribution.  SNPs  (“proxies”)  were  seiected  based  on  their 
independent-SNPo  vaiues  {middle,  open  bars).  The  top  (trianguiar)  and  m/dd/e  (bar  graph)  paneis  show  the  interaction  and 
(additive)  singie-SNP  contributions,  respectiveiy.  Bottom  panei  shows  the  mean  frequencies  (averaged  over  groups  of  aii  known 
SNPs  in  high  LD  to  each  proxy)  of  epigeneticaiiy  active  states  within  the  1 1 1  Roadmap  reference  epigenomes. 

cloi:1 0.1 371/journal. pone.01 6991 8.g002 
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Fig  3.  RA  collective  inference  p-values  and  epigenetic  activity  distribution,  m  =  70  proxy  SNPs  were  selected 
based  on  p,  values  {middle,  open  bars)  combined  with  clustering  to  reduce  LD.  The  top  (triangular)  and  middle  (bar 
graph)  panels  show  the  interaction  and  (additive)  single-SNP  contributions,  respectively.  Bottom  panel  shows  the  mean 
frequencies  of  epigenetically  active  states  as  in  Fig  2. 

doi:1 0.1 371/journal. pone.01 6991 8.g003 
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Cell-type-specific  effects  of  variants 

The  results  described  above  allow  us  to  test  our  first  hypothesis  regarding  the  ceU-type-speci- 
ficity  of  disease-associated  SNP  interactions.  For  this  purpose,  we  superimposed  the  single- 
SNP  and  interaction  maps  on  the  epigenetic-state  annotations  of  the  corresponding  genomic 
locations  in  cellular  contexts  [27].  We  used  reference  epigenomes  from  the  National  Institutes 
of  Health  Roadmap  Epigenomics  data  [32],  classifying  their  genomic  annotations  into  active 
(transcribed  or  enhancer)  and  inactive  states.  These  annotations,  inferred  from  machine  learn¬ 
ing  of  chromatin  modification  and  DNA  accessibility  data,  should  be  regarded  only  as  first 
approximations  of  the  actual  activity  of  genomic  loci.  We  also  accounted  for  the  fact  that  our 
proxy  SNPs  likely  reflect  true  causal  SNPs  indirectly  via  LD  by  averaging  the  active  state  fre¬ 
quencies  over  the  list  of  aU  known  SNPs  in  LD  (SI  and  S4  Figs).  The  resulting  map  (Fig  2,  bot¬ 
tom-,  see  S5  Fig  for  epigenome  codes  of  each  cell  type  [32])  suggested  that  a  significant  part  of 
the  MHC  SNPs  (class  II  region  including  HLA-DR  and  HLA-DQ)  were  highly  specific,  active 
largely  in  immune  cells  and  digestive  tissues  (fetal  stomach/intestine,  colon,  rectal/stomach/ 
duodenum  mucosa;  S5  Fig)  only.  In  contrast,  the  class  III  region,  PTPN22,  and  chr-12  loci 
were  uniformly  active.  The  SNPs  in  CTLA4  were  active  only  in  T  cells,  whereas  IL2RA  variants 
were  active  in  both  T  and  B  cell  groups,  the  latter  including  monocytes  (progenitors  of  DCs) 
and  natural  killer  cells  (S5  Fig).  The  RA  map  (Fig  3,  bottom)  included  SNPs  near  the  HLA-DP 
genes,  which  were  more  broadly  active  than  the  other  class  II  loci.  Overall,  the  epigenetic  activ¬ 
ity  distributions  suggested  that  the  dominant  part  of  single-SNP  contributions  from  the  MHC 
class  II  region  mostly  involve  the  action  of  B  and  T  cells  and  a  subset  of  digestive  mucosal  cells. 

To  infer  quantitative  measures  of  ceU-type-specific  association,  we  then  performed  enrich¬ 
ment  analyses  of  active  SNPs  and  their  interactions.  We  first  calculated  the  enrichmentp- 
values  for  the  over-representation  of  active  SNPs  among  individual  proxy  SNP  sets  within  dif¬ 
ferent  epigenomes  (Fig  4A)  and  found  strong  enrichment  in  T  cells  and  B  cells:  for  TID,  the 
highest  enrichment  was  with  T^eg  cells  (E044)  and  B  cells  (E032)  in  these  two  groups.  The  over¬ 
all  enrichment  was  stronger  with  TID  than  RA  (geometric  means  ofp-values  over  T  and  B  cell 
groups  were  1.1x10”®  and  0.031  for  TID  and  RA,  respectively).  This  comparison  exemplified  a 
general  trend  we  observed  (from  Figs  1-3):  significantly  higher  effect  sizes  and  power  in  TID 
data  than  in  RA,  but  broad  qualitative  similarities  across  the  two  diseases.  The  single-SNP 
enrichment  distribution  also  included  marginal  signatures  in  the  thymus,  a  number  of  diges¬ 
tive  tissues  (E106:  sigmoid  colon,  ElOl:  rectal  mucosa,  and  E077:  duodenum  mucosa),  in  addi¬ 
tion  to  the  lung  (E096)  and  spleen  (El  13). 

We  then  sought  to  delineate  how  interacting  SNP  pairs  (Figs  2  and  3)  were  represented  in 
different  cell  type  combinations.  The  enrichment  p -value  landscapes  in  Fig  4B  signify  the  over¬ 
representation  of  active  state-active  state  pairs  among  interacting  SNP  pairs  in  different  cell 
type  combinations.  The  most  prominent  were  those  with  peripheral  blood  B  cells  (E032): 

SNPs  active  in  B  cells  interacted  with  those  active  in  a  wide  range  of  T  cells,  hematopoietic 
stem  cells,  and  digestive  tissues,  in  addition  to  the  lung  and  spleen.  The  B  cells  from  cord 
blood  showed  similar  patterns  with  weaker  effects.  Despite  considerable  differences  in  SNP 
selection  and  the  resulting  interaction  p-value  distributions  between  TID  (Fig  2)  and  RA  (Fig 
3),  their  enrichment  patterns  (Fig  4B)  were  strikingly  similar,  suggesting  a  common  underly¬ 
ing  mechanism  of  autoimmune  responses. 

The  repertoire  of  cell  types  with  enriched  interaction  reflected  the  lineage  of  lymphocytes 
involved:  hematopoietic  stem  cells  (E035,  E051,  E050,  and  E036)  differentiate  into  immune 
cells,  including  B  cells  (E032  and  E031),  T  cells  (E034)  in  the  thymus  (E093),  and  monocytes 
(E029),  the  progenitors  to  DCs.  The  fetal  thymus  (E093)  featured  weakly  in  the  TID  landscape, 
interacting  with  B  cells,  whereas  the  adult  thymus  (El  12)  did  not.  This  observation  is 


PLOS  ONE  I  DOI:10.1371/journal.pone.0169918  January  12,  2017 


9/27 


t'PLOS 


ONE 


Genetic  Interaction  Effects  in  Autoimmunity 


10 

8  - 
a 

o  6  - 
O) 

O  4 
I 

2 

0 


T1D 


•■I.. 


f1>‘  ■ 


^  cP 


ll■■■.■■.l■■.-l„ll■— ■.■m.I.Ih.-IIIhiIj.iiIiIIiIiIi.i.mImI 

I  ’?=§§§?S®ssB§^g|i§?&iP°&|ii§i 


cT 


4  n 

o  ^ 

I  2 

I 

1  - 
0 


#V 


iilili _ □._IIIJ 


^  CgV 


dS-” 


li... _ l.._l  II... _ In.  I.■_l.■..l■llllllllll _ ...ill 


APC  T  cell 


Fig  4.  Cell  type-specific  enrichment  of  SNPs  and  their  interactions.  (A)  Single-SNP  enrichment  (over¬ 
representation  p-values)  of  T1 D  and  RA-associated  proxy  SNPs  (Fig  2)  in  reference  epigenomes.  (B)  Enrichment 
of  statistically  significant  interaction  pairs  active  in  cell  type  combinations.  Combinations  with  negligible 
enrichment  are  not  shown  for  clarity.  Note  the  dominance  of  B  cells  in  their  interaction  to  T  cells,  monocytes,  and 
intestinal  epithelial  cells.  See  S5  Fig  for  the  full  epigenome  names.  (C)  Interrelationships  of  genefic  factors 
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associated  with  autoimmunity  in  the  context  of  antigen  presenting  ceii  (ARC)  versus  T  ceii  interaction.  Thymic 
ARCS  are  stimuiated  by  the  epidermai  growth  factor  receptor  (EGFR)  and  interferon  (IFN)-y  signaiing  pathways  to 
express  major  histocompatibiiity  (MFIC)  ciass  II  molecules,  which  are  assembled  in  the  endoplasmic  reticulum 
(ER)  and  loaded  by  AIRE-induced  self-peptides  in  the  endosomal  MFIC  class  II  compartment  (MIIC).  The  peptide- 
MFIC  complex  is  recognized  by  the  T  cell  receptor  (TCR),  initiating  its  downstream  signaling  leading  to  the 
activation  of  NF-kB,  which  enters  the  nucleus  and  up-regulates  FOXP3  (in  Tregs),  interleukin  (IL)-2  receptors,  and 
IL-2  (in  conventional  T  cells).  IL-2  signaling  is  crucial  to  both  conventional  T  cell  proliferation  and  T^eg  cell 
activation.  Genes  shown  in  red  are  those  implicated  by  the  top-ranked  pathways  (Fig  6). 

doi:1 0.1 371/journal. pone.01 6991 8.g004 

consistent  with  the  fact  that  the  hulk  of  T  cell  maturation  in  the  thymus  occurs  during  the  pre¬ 
natal  phase,  and  fetal  and  adult  thymi  secrete  distinct  lineages  of  T  cells  [33].  Among  the  three 
different  classes  of  APCs,  mTECs  and  DCs  are  expected  to  be  reflected  in  the  fetal  thymus 
(E093)  and  monocytes  (E029)  epigenomes,  respectively.  We  found  that  T  cell  interactions  with 
monocytes  and  B  cells  were  significantly  enriched  but  the  corresponding  interaction  with  fetal 
thymus  was  not  (Table  1).  B  cells  from  peripheral  blood  showed  stronger  enrichment  than 
cord  blood;  Yamano  et  al.  recently  reported  that  peripheral  B  cells  immigrate  into  the  thymus 
and  are  transformed  into  thymic  B  cells  by  CD40  licensing  signals  from  T  cells  [20].  T 
cells  that  strongly  interacted  with  DCs  and  B  cells  comprised  naive  Th  cells  (E038)  and  T^eg 
cells,  which  is  consistent  with  the  view  that  both  clonal  deletion  and  T^g  induction  in  the  thy¬ 
mus  are  crucial  for  the  control  of  autoimmunity  [11]. 

Based  on  the  cell-type-specific  interaction  results  in  Table  1,  we  inferred  the  relative  impor¬ 
tance  of  different  APC  repertoire  in  autoimmunity  (Fig  5)  under  the  assumption  that  the  epi¬ 
genomes  of  naive  Th  cells,  fetal  thymus,  monocytes,  and  umbilical  cord  blood  B  cells  are 
reasonable  approximations  of  those  for  T  cells  in  the  thymus  and  periphery,  mTECs,  DCs,  and 
thymic  B  cells,  respectively.  The  sequence  of  significant  enrichment  patterns  in  Fig  5  indicates 
that  direct  interaction  of  T  cells  with  thymus  tissues  is  negligible,  DCs  presenting  mTEC- 
derived  self-antigens  interact  less  strongly  than  B  cells,  and  the  interactions  with  peripheral  B 
cells  are  strongest.  The  latter  likely  reflects  the  activation  of  cognate  Th  cells  that  escaped  the 
negative  selection  from  the  DC/B  cell  recognition  in  the  thymus. 

The  interaction  patterns  shown  in  Fig  4B  (TID)  additionally  pointed  to  the  interplay 
between  lymphocytes  and  intestinal  tissues.  In  particular,  the  interactions  between  B  cells 
(E032/E031)  and  mucosal  epigenomes  (ElOl:  rectum,  EllO:  stomach,  E077:  duodenum,  and 
E096:  lung)  reflect  contributions  from  the  MHC  class  111  region  near  the  BTNL2  gene.  BTNL2 
belongs  to  the  butyrophilin  family  of  co-stimulatory  proteins  homologous  to  the  B7  family 
(CD80  and  CD86),  the  binding  partner  of  CD28  (stimulatory)  and  CTLA4  (inhibitory)  recep¬ 
tors  on  T  cells  [34].  BTNL2  is  highly  expressed  in  lymphocytes  and  intestinal  epithelium  cells 
(see  Fig  2,  bottom)  and  its  simulation  can  inhibit  T  cell  activation,  in  addition  to  inducing  T^eg 
cell  development  (Fig  4C).  Evidence  for  interactions  between  BTNL2  and  B  cells  has  also  been 
reported  [34].  One  of  the  two  TID  proxy  SNPs  (rs9268480)  inside  the  coding  region  of  BTNL2 


Table  1 .  Enrichment  p-values  of  antigen-presenting  ceil  (APC)  versus  T  cell  interactions  associated  with  type  1  diabetes. 


APCs 

T  cells  (E034) 

Th  naive  (E038) 

Th  memory  (E040) 

Trcg(E044) 

Fetal  thymus  (E093) 

0.55 

0.81 

0.50 

0.22 

Monocytes  (E029) 

4.9  X  10“'' 

0.016 

2.7  X  10“'' 

4.2  X  10“® 

B  cells,  PB  (E032) 

8.4  X  10“'' 

1.4  X  10“® 

4.4  X  10“" 

2.7  X  10“'" 

B  cells,  CB(E031) 

1.8  X  10“® 

2.9  X  10“^^ 

1.3  X  10“" 

7.5  X  10“® 

The  p-values  signify  the  over-representation  of  interacting  single-nucleotide  polymorphism  pairs  active  in  each  cell  type  combination.  See  Fig  4B.  Shown  in 
parentheses  are  the  epigenome  codes.  RB,  peripheral  blood;  CB,  cord  blood. 
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Fig  5.  Sequence  of  relative  importance  of  APC  repertoire  in  T  ceil  selection  based  on  epigenomic 
enrichment.  The  p-values  indicated  (Tabie  1 )  refer  to  those  of  interactions  with  naive  for  fetai  thymus, 
monocytes,  B  ceiis  (cord  biood),  and  B  ceiis  (peripherai  biood),  respectiveiy. 
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was  28  bp  upstream  from  rs2076530,  previously  shown  to  cause  splicing-related  disruption  of 
BTNL2  and  an  increased  risk  for  sarcoidosis  [35]. 

Genetic  interaction  map 

The  interactions  shown  in  Fig  4B  count  all  contributions  from  SNP  pairs  active  in  each  tissue 
combination.  To  test  our  second  hypothesis  regarding  the  spatial  resolution  of  disease-associ¬ 
ated  interaction  effects,  we  considered  contributions  from  different  genomic  loci  and  gener¬ 
ated  histograms  of  significant  SNP  pairs  active  in  specific  cell  types  as  functions  of  their 
genomic  positions  (Fig  6,  left).  For  this  analysis,  we  used  TID  data  that  had  higher  effect  sizes 
than  RA.  Non-MHC  loci  had  only  a  few  signatures  of  interactions  with  marginal  significance 
(smallest p  ~  1x10“^)  between  CTLA4,HLA-DR,  HLA-DQ,  IL2RA,  and/NS  loci  (SI  Table).  We 
focused  on  the  MHC  region  in  the  subsequent  analysis  of  genetic  mapping.  Since  the  frequen¬ 
cies  of  epigenetically  active  states  at  different  genomic  positions  vary  depending  on  tissues,  we 
also  obtained  spatially  resolved  enrichment  p-values  (Fig  6,  right)  together  with  the  pair  distri¬ 
butions.  The  spatial  resolution  for  the  enrichment  test  was  limited  by  the  need  to  have  suffi¬ 
cient  numbers  of  interacting  SNP  pairs  in  each  grid  point.  In  this  sense,  the  absolute  (effective) 
pair  number  (/e/f  column)  and  enrichment  p-values  {right  column)  in  Fig  6  each  provide  infor¬ 
mation  regarding  the  genomic  positions  contributing  to  the  interaction  and  their  statistical 
significance,  respectively.  Three  representative  combinations  of  epigenomes  are  shown  in  Fig 
6,  with  additional  three  in  S6  Fig. 

Globally,  we  grouped  interactions  into  those  within  and  between  MHC  class  I  (31-31.5 
Mb),  class  III  (31.5-32.1  Mb),  and  class  II  (32.1-33  Mb)  regions.  We  first  examined  interac¬ 
tions  within  peripheral  blood  B  cells  (Fig  6A),  which  were  dominant  in  Fig  4B.  Broadly  distrib¬ 
uted  interactions  within  and  between  HLA-DQB1,HLA-DRB1,  and  HLA-DRA  genes  were 
most  notable.  In  professional  APCs  (B  cells  and  DCs),  the  protein  products  of  HLA-DRA, 
HLA-DRBl,  HLA-DQAl,  and  HLA-DQBl  form  the  MHC  class  II  molecule  in  the  ER,  bind 
antigenic  peptides,  and  present  them  on  the  cell  surface  to  T  cells  for  recognition  [31] 
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Fig  6.  Spatial  distribution  of  tissue-specific  SNP-interactions  for  T1 D.  The  overall  enrichment  profiles  of 
cell  type  combinations  in  Fig  4B  were  resolved  into  contributions  from  genetic  regions  within  the  MHC  locus. 
(A)  B  cell:B  cell,  (B)  B  cell:T  cell,  and  (C)  Duodenum  mucosa:Treg  cell  combinations.  The  left  and  right 
columns  show  the  effective  number  of  active  SNP  pairs  (in  a  20-kb  grid)  and  the  relative  enrichment  of  this 
number  within  the  given  cell  type  combination  against  the  total  (in  a  1 00-kb  grid),  respectively. 
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(Fig  4C).  The  strong  interactions  within  and  between  HLA-DQ  and  DR  loci  in  B  cells  (Fig  6A) 
are  consistent  with  this  core  component  (antigen-presentation  in  thymic  APCs)  of  TID  phe¬ 
notype,  and  supports  our  earlier  conclusion  that  thymic  B  cells  are  among  the  major  compo¬ 
nents  of  the  APC  repertoire.  The  HLA-DM  gene  encodes  a  chaperone  that  facilitates  the 
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loading  of  peptides  in  the  late  endosome  within  this  antigen  presentation  pathway  [31,  36]  (Fig 
4C);  hence,  the  interaction  of  the  DM-proximal  region  with  HLA-DQ  and  DR. 

We  also  found  evidence  of  interactions  between  HLA-DQA2  and  HLA-DQBl  in  B  cells  (Fig 
6A).  These  two  genes  are  paralogs  of  HLA-DQAl  and  HLA-DQBl,  and  are  mostly  silent, 
except  in  Langerhans  cells  [37].  Genetic  associations  of  SNPs  in  the  HLA-DQA2/B2  locus  have 
been  observed  in  many  other  diseases.  We  noted  above  that  the  additive  part  of  the  association 
strength  of  HLA-DQA2/B2  SNPs  was  reduced  significantly  in  collective  inference.  The  pres¬ 
ence  of  interactions  involving  this  locus  shown  in  Fig  6A  supports  the  notion  that  the  effects  of 
these  genes  on  disease  mechanism  are  cooperative  in  nature.  We  examined  the  epigenetic 
states  of  this  segment  of  MHC  class  11  locus  (S5  Fig)  and  found  them  to  be  silent  in  most  tis¬ 
sues,  but  HLA-DQB2  was  weakly  transcribed  in  monocytes  (E029)  and  B  cells  (E032),  which  is 
broadly  consistent  with  the  reported  expression  in  Langerhans  cells  [37].  HLA-DQB2  also  con¬ 
tained  a  small  (-0.4  kb)  enhancer  segment  in  the  pancreas  (E098).  Importantly,  HLA-DQA2 
contained  a  segment  (-1  kb)  of  enhancers  in  B  cells  (E032)  and  hematopoietic  stem  cells  (E051 
and  E036).  We  suggest  that  the  interaction  between  HLA-DQA2  and  HLA-DQBl  (Fig  6)  indi¬ 
cates  that  the  target  genes  of  the  B  cell  enhancer  region  in  HLA-DQA2  are  likely  HLA-DQAl/ 
B1  and  HLA-DR. 

Within  the  class  Ill  region,  we  observed  significant  interactions  between  the  regions  close 
to  BAG6  and  HLA-DQBl  as  well  as  HLA-DRA  genes.  BAG6  encodes  a  chaperone  with  a  wide 
range  of  functions  [38],  one  of  which  is  the  co-regulation  of  MHC  class  II  expression  in  APCs 
together  with  CIITA  [39].  Both  CIITA  and  BAG6  are  activated  in  response  to  interferon 
(IFN)-y,  which  as  a  result  stimulates  APC  function  (Fig  4C).  The  BAG6-proximal  region  of  the 
MHC  class  III  locus  was  broadly  expressed  in  all  cell  types  (Fig  2).  Its  interaction  with  HLA- 
DQBl  gene  in  Fig  6A  is  consistent  with  this  MHC  class  II  regulatory  function  of  BAG6.  In  the 
class  I  region,  there  were  signs  of  interactions  within  a  region  close  to  MICB  gene  and  between 
MICB  and  HLA-DR/DQ  loci. 

We  examined  analogous  interaction  patterns  between  SNPs  active  in  T  cells  and  those 
active  in  B  cells  (Fig  6B).  Some  of  the  interactions  observed  in  B  cell  self-interactions  (Fig  6A) 
were  weaker  or  absent  in  this  combination.  The  presence  of  HLA-DQBl  local  interactions  in 
Fig  6B  likely  reflects  the  fact  that  this  region  is  also  epigeneticaUy  active  in  T  cells  (Fig  2).  Inter¬ 
actions  involving  HLA-DQA2  and  HLA-DRA  from  T  cells  were  absent.  On  the  other  hand,  the 
presence  of  interactions  involving  BAG6-proximal  regions  suggest  a  causal  relationship:  in 
addition  to  its  regulatory  function  toward  MHC  class  II  molecules,  BAG6  is  known  to  act  as  a 
negative  regulator  of  HAVCR2,  which  is  expressed  in  exhausted  T  cells  along  with  CTLA4  [40] 
(Fig  4C).  BAG6  rescues  Th  cells  from  exhaustion  and  its  overexpression  can  lead  to  increased 
risk  for  autoimmune  diseases  in  mice  [41]. 

Most  notable  in  Fig  6B  was  the  presence  of  interaction  between  the  region  close  to  PSMB8/ 
9  in  T  cells  and  the  HLA-DQ  locus  in  B  cells  (also  present  in  B  cell  self-interactions  in  Fig  6A). 
PSMB8/9  are  located  close  to  TAPl/2  (S2  Fig)  and  Fig  6B  per  se  does  not  make  it  clear  which 
gene  groups  are  causal.  They  each  encode  catalytic  subunits  of  immunoproeasomes  and  ER 
peptide  transporter  complex,  respectively,  which  together  form  an  integral  part  of  MHC  class  I 
antigen  presentation  pathway  [31],  whose  relevance  to  autoimmunity,  however,  is  likely  to  be 
secondary.  Proteasomes  also  process  key  transcription  factors,  notably  NE-kB  [42],  upon  TCR 
activation,  which  allows  for  the  expression  of  interleukin  (IL)-2  and  regulators  such  as  EOXP3. 
In  resting  T  cells,  NF-kB  remains  in  an  inactive  form  as  a  part  of  inhibitors  of  kB  kinase  (IKK) 
complexes.  When  the  peptide-MHC  class  II  signal  from  APCs  arrives  via  phosphorylation  cas¬ 
cades,  it  allows  for  the  polyubiquitination  of  the  IKK  complex.  Degradation  by  the  proteasome 
then  releases  the  active  NF-kB  dimer,  which  enters  the  nucleus  and  binds  to  DNA  to  express 
key  regulators  leading  to  T  cell  proliferation  and  development  (Fig  4C);  IL-2  secretion  and  its 
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recognition  by  IL-2  receptors  encoded  by  IL2RA/RB  are  indispensable  to  T  cell  proliferation  in 
general.  FOXP3  is  the  distinguishing  marker  for  T^eg  cells,  which,  however,  do  not  express  IL- 
2  and  rely  on  other  T  cells  for  IL-2.  A  deficiency  in  psi  encoded  by  PSMB8  can  lead  to 
autoimmunity  [43]  and  affect  the  T  cell  lineage  differentiation. 

We  therefore  suggest  that  the  interactions  shown  in  Fig  6B  reflect  those  between  the  poly¬ 
morphisms  in  MHC  class  II  molecules  expressed  in  thymic  B  cells  (and  other  APCs)  and  the 
immunoproteasome  variants  that  affect  TCR  signaling.  The  two  effects  would  each  modify 
TCR  activation  signal  strengths  and  NF-kB  activation  levels,  which  would  in  turn  affect  IL-2 
and  FOXP3  expression  levels  in  negative  selection  and  T^eg  ceU  differentiation,  respectively. 
Consistent  with  this  interpretation,  PSMB8/9  versus  HLA-DQ  interactions  in  Fig  6B  were  sig¬ 
nificantly  stronger  when  the  two  loci  each  belonged  to  T  and  B  cells,  respectively,  compared  to 
the  B  and  T  ceU  combination. 

We  also  compared  the  interaction  patterns  between  SNPs  active  in  duodenum  mucosa 
(representative  of  intestinal  epithelial  ceUs)  and  T^eg  ceUs  (Fig  6C),  and  confirmed  our  assertion 
that  intestinal  tissue  interactions  arose  from  BTNL2  expressed  in  mucosal  tissues,  which  can 
lead  to  Treg  cell  development  [34]  (Fig  4C).  Interestingly,  the  main  interaction  partner  of 
epithelial  BTNL2  was  the  BAG6  region  in  T^eg  cells,  which  suggests  that  BAG6  may  also  be 
involved  in  this  process  in  addition  to  its  interaction  with  HAVCR2.  We  found  similar  interac¬ 
tion  signatures  in  the  duodenum  versus  the  B  ceU  combination  (S6  Fig),  with  stronger  interac¬ 
tions  between  BTNL2  and  HLA-DQBl. 


Collective  inference  with  pathway-based  SNP  selections 

To  complement  the  analysis  of  interactions  between  the  MHC  loci,  we  selected  genome-wide 
SNPs  based  on  the  Reactome  pathways  [44]  and  evaluated  each  SNP  selection  for  association 
with  autoimmunity.  Our  approach  differs  from  other  enrichment-based  pathway  analysis 
methods  for  SNPs  [45]  because  all  possible  interaction  effects  between  SNPs  contained  in  each 
pathway  are  taken  into  account  [27].  For  TID,  the  mean  number  of  SNPs  selected  in  each 
pathway  ranged  up  to  several  thousands  (excluding  generic  pathways  at  the  top  hierarchical 
level)  and  AUC  scores  up  to  -0.80  (S7  Fig).  Pathways  containing  genes  within  the  MHC  loci, 
including  PSMB  genes  and  NOTCH4,  were  highly  scored,  and  most  of  non-immune  pathways 
in  TID  data  with  AUC  >  0.6  contained  PSMB  genes.  We  therefore  restricted  our  attention  to 
the  pathways  belonging  to  Immune  system  group,  aiming  to  gain  insights  into  the  effect  of 
interactions  involving  MHC  and  other  genetic  loci.  The  13  top-ranked  pathways  (AUC  -0.80 
and  -0.68  for  TID  and  RA,  respectively)  were  clearly  separated  from  the  rest  and  all  consisted 
of  MHC  class  Il-related  pathways  (Fig  7  and  S7  Fig).  The  analogous  inference  without  interac¬ 
tion  led  to  significantly  lower  AUC  values  (-0.75  and  -0.65  for  TID  and  RA,  respectively; 
S7Fig). 

In  Fig  7,  the  high  association  of  MHC  class  II  antigen  presentation,  represented  by  the  class 
II  genes  including  HLA-DM  and  HLA-DO,  reflects  the  general  dominance  of  HLA-DQ  and 
HLA-DR  loci  in  Fig  2:  MHC  class  II  molecules  assembled  in  the  ER  with  the  invariant  chain 
(li),  initially  occupying  the  peptide  binding  domain,  moves  into  an  endosomal  compartment 
and  li  is  exchanged  with  antigenic  peptides  with  the  help  of  HLA-DM  and  HLA-DO,  which 
down-regulates  HLA-DM  in  B  cells  [31,  36,  46].  The  peptide-MHC  class  II  complex  is  then 
displayed  on  the  plasma  membrane  (Fig  4C).  The  subsequent  recognition  of  the  peptide-MHC 
class  II  complex  by  the  TCR  on  CDd"^  T  cells  and  the  resulting  activation  of  T  cells,  represented 
by  the  TCR  signaling  pathways,  showed  similarly  high  association.  The  proximal  part  of  TCR 
signaling  after  the  MHC  class  II  recognition  (Fig  4C)  starts  with  the  recruitment  of  lympho¬ 
cyte-specific  protein  tyrosine  kinase  (LCK)  by  CD4,  which  phosphorylates  TCR  complex  and 
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Fig  7.  Immune  system  pathways  scored  by  collective  inference.  The  pathways  belonging  to  the  Immune 
system  pathway \n  Reactome  hierarchy  in  association  with  (A)  T1D  with  AUC  >  0.65  and  (B)  RA  with 
AUC  >  0.57  are  shown.  The  dendrograms  below  the  bars  represent  the  relative  hierarchical  relationships.  Error 
bars  are  95%  c.i.  Adv.,  advanced;  DAP12,  DNAX  activation  protein  of  12kDa;  DHX,  aspartate-glutamate-any 
amino  acid  aspartate/histidine  (DExD/H)  box  helicase;  ER,  endoplasmic  reticulum;  exog.  exogenous; 
immunoreg.,  immunoregulatory;  IFN,  interferon;  inter.,  interation;  PD-1 ,  programmed  cell  death  protein  1 ; 
present.,  presentation;  RIP,  receptor-interacting  protein;  sol.,  soluble;  TCR,  T  cell  receptor;  ZBP1 ,  Z-DNA 
binding  protein  1 . 
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recruits  ^-chain  associated  protein  kinase  of  70kDa  (ZAP-70)  [47],  PTPN22  downregulates 
this  LCK  action  and  thus  TCR  signaling,  and  its  inhibition  can  lead  to  increased  risks  for  auto¬ 
immunity  [5].  These  steps  are  described  hy  Phosphorylation  ofCD3  and  TCR  zeta  chains  and 
Translocation  of  ZAP-70  to  immunological  synapse,  to  which  both  MHC  class  II  genes  and 
PTPN22  contribute.  ZAP-70  leads  to  the  formation  of  linker  for  activation  of  T  cells  (LAT) 
complex  comprising  a  host  of  binding  partners  that  include  adhesion  and  degranulation-pro¬ 
moting  adaptor  protein  (ADAP)  and  phospholipase  C  (PLC)-y.  The  LAT  complex  leads  to 
three  major  downstream  pathways:  the  Ca^"^,  the  mitogen-activated  protein  kinase  (MAPK) 
kinase  and  the  NF-kB  signaling  pathways,  in  addition  to  the  “inside-out”  signaling  that 
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activates  integrins  via  actin  cytoskeleton  reorganization  (Fig  4C).  The  inside-out  pathway  is 
mediated  by  the  binding  of  ADAP  to  Drosophila  enabled  (Ena)/vasodilator-stimulated  phos- 
phoprotein  (VASP),  a  regulator  of  actin  polymerization  [48],  and  results  in  the  formation  of 
activation  clusters  centered  around  peptide-MHC-TCR  complexes  allowing  for  enhanced 
interactions  between  APCs  and  T  cells  [49].  In  addition  to  MHC  class  II  genes,  the  pathway 
Generation  of  second  messenger  molecules  was  implicated  hjENAH  that  encodes  the  Ena  pro¬ 
tein.  ENAH  polymorphisms  may  thus  affect  the  integrin-mediated  APC-T  cell  adhesion  com¬ 
ponent  of  the  TCR  activation  process:  there  were  six  associated  SNPs  with  pi  <  10"^  in  the 
coding  region  of  ENAH  (S8  Fig)  in  TID  with  the  strongest  association  by  rs2639703  {pi  = 
1.2x10“^). 

Downstream  TCR  signaling  represents  the  process  leading  to  the  activation  of  transcription 
factor  NF-kB,  which  enters  the  nucleus  and  allows  for  the  expression  of  many  regulators  and 
cytokines,  including  IL-2  and  FOXP3  (Fig  4C),  thus  playing  critical  roles  in  T^eg  cell  develop¬ 
ment  [42].  This  pathway  contains  NEKBl,  CDC34,  PSMB8/9,  and  PSMD14,  in  addition  to 
MHC  class  II.  NPKBl  encodes  the  pI05  subunit  of  the  NF-kB  protein  family,  which  undergoes 
proteasomal  processing  upon  TCR  activation  and  combines  with  a  p65  subunit  to  form  the 
active  NF-kB  heterodimer  [42]  (Fig  4C).  The  association  of  NPKBl  indicates  that  the  NF- 
kB  downstream  pathway  is  central  to  autoimmunity  [  1  ]  and  the  gene  was  represented  by 
rsl314336  (p,  =  6.0x10“"^,  44  kb  upstream;  S8  Fig).  The  disease  association  of  proteasome- 
encoding  genes  and  CDC34  in  this  pathway  reinforces  this  interpretation:  IKK  undergoes  TCR 
signal-induced  polyubiquitination,  a  prerequisite  for  proteasomal  degradation  necessary  to 
produce  a  functional  NF-kB  molecule.  The  CDC34  gene  encodes  the  E2  enzyme  catalyzing 
ubiquitination  [50],  represented  by  rs4919910  located  22  kb  upstream  {pi  =  1.6xlO“"';S8  Fig). 
The  PSMB8/9  region  in  the  MHC  class  II  locus  had  19  SNPs  implicated  (top  association: 
rs241427,  p,  =  4.75x10”®^).  PSMD14  encodes  a  subunit  in  the  19S  regulatory  cap  complex  and 
had  moderate  levels  of  association  {pi  ~  10“^;  S8  Fig). 

Costimulation  by  the  CD28 family  contains  co-stimulatory  and  co-inhibitory  signaling  path¬ 
ways  that  reinforce  or  downregulate  TCR  signaling  [51]  and  contains  CTLA4  and  PTPNl  1  in 
addition  to  MHC  class  II.  CD28  co-receptors  bind  CD80/CD86  ligands  expressed  on  APC  sur¬ 
faces  and  upregulate  proximal  signaling.  CTLA4  competes  for  CD80/CD86  binding  and,  along 
with  PDl,  recruits  phosphatases  including  PTPNl  1,  which  dephosphorylates  key  signaling 
complexes  (Fig  4C).  The  class  I  antigen  presentation  and  cross-presentation  pathways  had  sig¬ 
nificantly  lower  AUCs.  The  MHC  class  I  pathways  can  affect  autoimmunity  via  the  CD8'^  T 
cell  selection  in  the  thymus  and  also  during  late-stage  pathogenesis  during  effector  T  cell  acti¬ 
vation,  which,  however,  would  involve  constitutive  proteasomes  rather  than  immunoprotea- 
somes.  CDC34  is  included  in  these  pathways  as  a  candidate  E2  enzyme  that  may  substitute 
UBE2A.  SOCSl  mainly  functions  as  a  negative  regulator  of  cytokine  signaling,  but  can  also 
participate  in  E2-mediated  ubiquitination  [52].  TRIM39  is  one  of  the  RING  domain-contain¬ 
ing  proteins  that  share  broad  activities  as  E3  ubiquitin  ligases  [53].  The  multiple  appearances 
of  these  proteasome-related  genes  in  different  pathways  suggest  that  proteasomes  play  key 
roles  in  the  control  of  autoimmunity.  Overall,  the  relatively  lower  association  strengths  of 
these  pathways  imply  that  autoimmune  risk  is  mainly  driven  by  CDd"^  T  cell  (Th  and  T^eg) 
selection  and  activation  involving  MHC  class  II  molecules. 

The  Cytokine  signaling  pathway  group  was  among  the  top-ranked  pathways,  primarily  due 
to  Interferon  gamma  signaling,  which  describes  the  classical  Janus  kinase  (JAK)-signal  trans¬ 
ducer  and  activator  of  transcription  (ST AT)  pathway  initiated  by  the  binding  of  IFN-y  to  its 
receptor  that  leads  to  the  expression  of  numerous  genes  [54],  including  MHC  class  I/II  and 
CIITA,  which  controls  class  II  expression  (Fig  4C).  The  MHC  class  I  and  II  genes  present 
together  in  this  pathway  as  downstream  targets  explain  its  high  AUC.  SOCSl, PTPNll,  and 
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PTPN2  [55]  encode  proteins  that  inhibit  JAKs  and  act  as  negative  regulators.  IFN-y  also 
induces  the  expression  of  a  large  number  of  TRIM  genes,  many  of  which  act  as  E3  ubiquitin 
ligases  for  proteasomal  degradation  [53].  Three  TRIM  proteins  (TRIM10/26/3I),  located  in 
the  MHC  region  (S8  Fig),  appeared  as  the  target  genes  of  this  pathway,  supporting  the  autoim¬ 
munity  association  of  proteasomes.  IL-2  signaling,  containing  the  key  disease-associated  genes 
IL2  and  IL2RA/IL2RB  (encoding  receptor  a  and  (3  chains),  had  AUC  =  0.62  in  TID.  These 
genes  accounted  for  10  SNPs  withp,-  <  10“^  (S8  Fig).  The  binding  of  IL-2  on  IL-2R  activates 
JAKl  and  JAK3  and  lead  to  multiple  signaling  pathways,  including  Ras-MAPK,  JAK-STAT, 
and  phosphoinositol  (PI)3-kinase  pathways.  The  Ras-MAPK  pathway  is  activated  by  the 
recruitment  of  SH2  containing  (SHC)  protein  by  the  IL-2/IL-2R  complex.  RAFl  and  BRAP 
negatively  regulate  Ras-MAPK  pathway,  the  latter  using  ubiquitination/proteasomal  degrada¬ 
tion.  BRAP  is  located  close  to  PTPNl  1  and  they  appeared  to  form  a  loci  of  association  together 
on  chromosome  12  (S8  Fig).  Among  many  other  input  signals  to  the  Ras-MAPK  pathway  is 
ERBB3,  one  of  the  established  TlD-associated  loci  [10].  ERBB3  forms  the  starting  point  of  the 
epidermal  growth  factor  receptor  kinase  signaling  involving  the  Ras-MAPK  pathway,  which 
can  influence  the  gene  expression  in  APCs  (Fig  4C). 

Association  of  pathway-implicated  genes  in  larger  data  sets 

We  compared  the  association  of  non-MHC  genetic  regions  contributing  to  pathways  in  the  RA 
data  (S8  Fig)  and  found  generally  weaker  association  than  in  TID,  except  in  PTPN22,  where 
the  RA  association  was  equally  strong  and  in  IL2RA  (rsl0795791  and  rs2104286)  and  IL2RB 
regions  (rs3218253  and  rs743777),  where  there  were  stronger  associations.  To  further  evaluate 
the  significance  of  these  genetic  regions,  we  compared  them  based  on  the  summary  statistics  of 
two  large-sample  TID  data  sets  by  Bradfield  et  al.  (BRF)  [8]  and  Barrett  et  al.  (BRT)  [10]  (S9 
Fig).  Some  of  these  genes  were  within/close  to  previously  identified  loci:  PTPN22,  CTLA4,  and 
IL2RA  are  among  the  major  robustly  identified  autoimmune  risk  loci  [6]  and  all  showed  higher 
degrees  of  association  in  the  BRF/BRT  data  sets.  In  addition,  we  noted  significant  groups  of 
SNPs  in  the  BRF  data  set  near  CD28  upstream  from  CTLA4;  in  TCR  signaling,  CD28  and 
CTLA4  each  act  as  stimulatory  and  inhibitory  co-receptors,  respectively  (Fig  4C).  PSMD14, 
which  had  a  number  of  SNPs  withp,  ~  10”^  in  TID,  was  surrounded  by  a  large  number  of 
SNPs  but  with  similarly  low  levels  of  significance  in  the  BFR/BFT  data  sets,  while  being  located 
close  to  the  previously  identified  IFIHl  locus  [6].  IL2  at  4q27  is  a  part  of  the  locus  centered 
around  KIAA1109  and  showed  increased  levels  of  significance  in  larger  data  sets.  ERBB3  and 
PTPN2,  also  identified  in  previous  studies  [6],  showed  increased  levels  of  significance  in  larger 
data  sets  (S9  Fig).  Along  with  ERBB3  at  12ql3,  BRAP  and  PTPN 1 1  at  12q24  form  the  two  main 
regions  of  the  chr-12  loci  (SI  Fig).  This  broad  region  of  association  had  increased  significance 
levels  in  the  larger  studies  with  the  BRAP-proximal  region  reaching  p;  ~  10“^°  in  the  BRT  data 
set.  IL2RB  is  within  the  Cl  QTNF6  locus  in  22ql2,  which  showed  a  moderate  increase  in  associ¬ 
ation  strength.  The  CLEC16A  locus  on  chromosome  16  contained  CIITA  and  SOCSl. 

The  ENAH,  RAFl ,  NFKBl,  and  CDC34-proximal  loci  are  regions  with  potential  association 
of  collective  origin;  when  we  compared  their  association  patterns  in  WTCCC  and  larger  data 
sets  (S8  and  S9  Figs),  they  either  showed  a  moderate  increase  in  significance  in  the  larger  stud¬ 
ies  or  instead  had  similar  p,  ranges  but  showed  more  weU-defmed  appearance  of  a  peak  cen¬ 
tered  around  the  genes. 

Discussion 

The  cell-type-specific  interactions  between  genetic  polymorphisms  we  characterized  illuminate 
two  different  aspects  of  autoimmune  disease  pathogenesis:  central  and  peripheral  tolerance. 
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The  central  tolerance  in  the  thymus  comprises  both  negative  selection  and  T^eg  development, 
whereas  peripheral  tolerance  involves  the  suppressive  action  of  thymic  T^eg  (tT^eg)  cells  origi¬ 
nating  from  the  thymus  as  well  as  the  peripherally  induced  T^eg  (pTreg)  cell  development  from 
naive  FOXPS'CDd"^  T  cells  [56].  The  strength  of  self-interactions  between  thymocyte  TCRs 
and  self-peptide-MHC  class  II  complexes  presented  by  APCs  (mTECs,  DCs,  and  thymic  B 
cells)  is  the  core  determinant  influencing  both  the  negative  selection  and  the  development  of 
tTjeg  cells:  intermediate  and  high  levels  of  avidities  likely  favor  tT^eg  differentiation  and  nega¬ 
tive  selection,  respectively  [22,  57]. 

mTECs  express  and  present  TRAs  induced  by  AIRE  or  transfer  them  to  DCs,  whose 
migratory  variants  can  also  bring  in  peripheral  antigens  for  presentation.  Recent  studies 
have  indicated  that  thymic  B  cells,  either  developed  within  or  having  immigrated  from  the 
periphery,  can  also  efficiently  present  self-peptides  to  immature  T  cells  [19,  20],  facilitating 
negative  selection  as  well  as  T^gg  cell  development  [58].  Yamano  et  al.  showed  that  this  con¬ 
version  of  peripheral  B  cells  requires  a  CD40  signal  and  that  thymic  B  cells  also  express  TRAs 
modulated  by  AIRE  [20].  Peripheral  B  cells,  while  generally  inefficient  as  APCs,  can  present 
BCR-recognized  peptides  and  activate  cognate  CDd"^  T  cells.  B  cells  have  been  found  to  pres¬ 
ent  pancreatic  P-cell  autoantibodies  to  T  cells  and  contribute  to  TID  [14].  We  found  pre¬ 
dominantly  strong  disease-associated  interactions  between  B  cells/DCs  and  T  cells,  whereas 
the  corresponding  interactions  were  minimal  for  mTECs  (Fig  4  and  Table  1),  which  support 
the  experimental  studies  suggesting  central  roles  played  by  thymic  and  peripheral  B  cells  as 
APCs.  It  has  been  suggested,  in  particular,  that  AIRE-induced  TRA  expression  profiles  in  B 
cells  and  mTECs  are  likely  to  be  distinct,  and  that  the  APC  function  of  thymic  B  cells  may 
serve  to  influence  CDd"^  T  cell  populations  such  that  BCR-induced  activation  of  autoimmu¬ 
nity  in  the  peripheral  tissues  could  be  minimized  [20] .  Our  results  imply  that  the  MHC  class 
II  polymorphisms  exert  their  dominant  effects  on  autoimmune  risk  primarily  via  their  pre¬ 
sentation  on  thymic  B  cells.  Impairment  of  this  B  cell-specific  negative  selection  and  T^eg 
development  are  consistent  with  the  widespread  detection  of  autoantibodies  in  late-stage 
TID  pathogenesis  [11]. 

Although  both  negative  selection  and  T^eg  development  result  from  TCR  signaling,  with 
their  respective  fates  likely  determined  by  relative  MHC-TCR  recognition  strengths,  there  are 
distinct  molecular  level  features  characterizing  T^eg  development,  notably  FOXP3  expression. 
In  particular,  among  the  possible  products  of  downstream  TCR  signaling,  evidence  indicates 
that  NF-kB  is  the  primary  transcription  factor  crucial  for  T^eg  cells:  mice  lacking  its  main  sig¬ 
naling  intermediates  (PKC9,  CARD  11,  and  IKK;  see  Fig  4C)  develop  severe  T^^g  deficiency 
[22].  Our  results  support  the  relative  importance  of  thymic  B  cell-Tjeg  interactions  during  T 
cell  maturation  (Table  1)  and  the  central  importance  of  NF-kB  signaling  (Fig  7). 

It  is  further  believed  that  T^eg  cell  differentiation  consists  of  two  distinct  steps:  the  first 
step  produces  a  FOXP3  CD25'^  T^eg  cell  precursor  as  a  result  of  MHC-TCR  interaction  and 
NF-kB  binding  to  aPOXPd  enhancer  element.  The  TCR-independent  second  step  requires 
the  binding  of  IL-2  (expressed  by  other  CDd"^  T  cells)  to  the  (TCR  signaling- induced)  IL2 
receptor,  which  produces  STAT5  that  stimulates  the  expression  of  FOXP3  (Fig  4C)  [22]. 
SNPs  outside  the  MHC  region  influencing  these  pathways  have  association  strengths  much 
weaker  than  MHC  loci.  Through  pathway-based  SNP  selection  combined  with  collective 
inference,  we  correctly  ranked  these  key  pathways  (Fig  7)  and  identified  relevant  genetic  fac¬ 
tors,  some  of  which  were  among  the  set  of  well-characterized  loci  {CTLA4,  PTPN22,  and 
IL2RA),  while  others  were  close  to  previously  identified  loci  {IL2,  CIITA/SOCSl,  and  IL2RB), 
helping  to  illuminate  their  biological  interpretations.  Some  were  those  with  levels  of  associa¬ 
tion  below  genome-wide  significance  (e.g.,  ENAH,  RAF1,NFKB1 ,  and  CDC34)  and,  individ¬ 
ually,  would  look  unremarkable  within  a  genome-wide  scan,  but  collectively  contributed  to 
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the  overall  disease  association  of  pathways  such  as  IL-2  signaling,  which  is  devoid  of  MHC 
SNPs. 

Our  analysis  also  pointed  to  two  factors  whose  roles  in  autoimmunity  have  been  suggested 
experimentally  but  their  genetic  contexts  have  not  been  clear.  One  is  BAG6,  whose  multiple 
appearances  in  B  cell  self-interactions  (Fig  6A)  and  B  cell-T  cell  interactions  (Fig  6B)  are  read¬ 
ily  explained  by  its  dual  roles  in  MHC  class  II  expression  in  APCs  and  the  negative  regulation 
of  HAVCR2  (and  FOXP3)  in  T  cells  (Fig  4C).  The  other,  BTNL2  expressed  on  intestinal  epi¬ 
thelial  cells,  explains  a  different  aspect  of  peripheral  tolerance  via  its  interaction  with  T^eg  cells 
(Fig  6C):  the  induction  and  maintenance  of  pT^eg  cell  populations  in  intestines  by  microbial 
and  self-antigens  [34].  pTreg  cells  appear  to  have  TCR  specificities  distinct  from  tTregs  and 
presumably  act  cooperatively  with  tTregs  to  suppress  self- reactive  effector  T  cell  activation 
[56].  Based  on  our  observation  of  interactions  involving  intestinal  epithelial  cells  in  Fig  4B,  we 
suggest  that  both  DCs  (by  presenting  antigens  derived  from  intestinal  cells)  and  B  cells  (by 
BCR-induced  antigen  presentation)  contribute  to  pT^eg  development  in  the  gut.  This  periph¬ 
eral  tolerance  component  involving  pTj-eg  cells  also  likely  explains  the  effect  of  environmental 
factors  on  TID  risk:  non-obese  diabetic  mice  are  less  prone  to  TID  when  exposed  to  microbes, 
and  the  human  disease  is  more  prevalent  in  industrialized  societies  with  limited  exposure  to 
bacteria  [11].  The  large  proportion  of  effector  B  and  T  cells  in  intestinal  tracts  also  implies  that 
the  activation  of  conventional  Th  cells  cognate  to  self-antigens  (presented  for  example  by 
migratory  DCs)  may  occur  in  or  near  intestinal  organs,  whose  action  toward  effector  cells  is 
suppressed  by  tTj-eg  and  pT^eg  cells. 

It  is  worth  emphasizing  that  the  main  results  discussed  above  arise  from  effects  beyond  sin- 
gle-SNP  association:  Fig  4A,  for  instance,  cannot  resolve  ceU-to-cell  interactions.  This  coopera¬ 
tive  nature  of  genetic  interactions  is  further  illustrated  by  BAG6  and  BTNL2,  whose  variants 
had  significantly  reduced  association  in  the  additive  component  ofp-values  (Figs  2  and  3) 
under  collective  inference,  featuring  strongly  instead  in  interaction  maps  (Fig  6).  This  finding 
is  consistent  with  the  expectation  that  because  SNPs  near  BAG6  affect  disease  by  modifying 
CIITA-based  regulation  of  MHC  class  II  antigen-presentation  pathway  [39]  (Fig  4C),  they  are 
dependent  on  HLA-DQBl  variants.  Likewise,  SNPs  in  the  BTNL2  region  presumably  affect  the 
affinity  of  the  ligand  expressed  on  the  surface  of  APCs  toward  a  receptor  yet  to  be  identified  on 
T  cells,  contributing  to  T^^g  cell  development  [34].  In  particular,  Fig  6C  suggests  that  in  addi¬ 
tion  to  negatively  regulating  HAVCR2  signaling  [41],  BAG6  may  also  act  on  the  signaling  of 
this  BTNL2  receptor  (Fig  4C). 

We  also  noted  interactions  involving  MICB  in  the  class  I  region  in  Fig  6.  MICB  and  neigh¬ 
boring  MICA  encode  molecules  highly  expressed  in  intestinal  epithelium  (rs2248459  and 
rs2248617  in  Fig  2)  and  are  recognized  by  the  similarly  localized  V8l  subset  of  y5  T  cells  [59]. 
In  this  context,  yS  T  cells  in  intestinal  tissues  help  eliminate  foreign  antigens  and  stressed  cells, 
which  may  also  contribute  to  the  control  of  autoimmunity. 

Our  computational  analysis  adopted  a  shift  in  spirit  from  other  studies  testing  for  possible 
interactions  between  SNPs:  instead  of  aiming  to  find  a  few  causal  interaction  pairs  among 
many  with  a  sufficiently  small  false  discovery  rate,  we  established  a  large  pool  of  interacting 
pairs  with  a  liberal  significance  threshold  and  derived  coarse-grained  features,  namely  the 
cell  type-specific  interaction  patterns  in  Fig  4B.  The  low-dimensional  nature  of  the  derived 
features  supports  such  analysis.  For  instance,  -100  potentially  significant  interactions  con¬ 
tributing  to  such  features  would  not  be  affected  significantly  even  when  false  positives 
among  them  number  more  than  -5.  We  anticipate  that  applications  of  similar  approaches  to 
other  disease  classes  based  on  large-scale,  meta-analysis  data  may  also  yield  useful  biological 
insights. 
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Methods 

Analysis  software  and  computations 

We  used  GeDI  (genotype  distribution-based  inference)  [27],  available  at  http://www.github. 
com/BHSAI/GeDI,  for  independent-SNP  analysis,  collective  inference,  cross-validation,  and 
interaction  p-value  calculations.  The  genotypic  model  (2  and  4  degrees  of  freedom  for  additive 
and  interaction  parts,  respectively)  was  used  for  aU  cases.  For  data  management  tasks  and  LD 
calculation,  we  used  PLINK  [60]  version  1.9.  We  verified  that  non-interacting  pi-values  from 
GeDI  were  similar  to  the  PLINK  outputs.  For  consistency  with  collective  inference  results,  we 
used  the  GeDI  results  for  the  special  cases  of  independent-SNPs. 

For  collective  inference,  we  used  the  pseudo-likelihood  method  with  I2  penalizer  X,  except 
for  pathway  scoring,  for  which  we  used  the  mean-field  option.  Computational  costs  for  the  for¬ 
mer  were  of  the  order  of  ~1  hr  for  a  single  collective  inference  involving  -100  SNPs  on  a  desk¬ 
top  machine  without  additional  tests.  We  performed  interaction  p-value  calculations  (based 
on  phenotype  label  permutation)  using  the  Department  of  Defense  High-Performance  Com¬ 
puting  resources:  for  m  =  100  SNPs,  this  calculation  would  consider  100x99/2  =  4,950  SNP 
pairs,  each  of  which  was  repeated  at  least  3,000  times  to  construct  the  null  distributions, 
amounting  to  -10^  h  if  run  on  a  single  desktop  computer. 

Genome-wide  data  processing 

We  obtained  the  WTCCC  case-control  data  for  TID  and  RA  and  built  the  case  and  control 
data  sets  excluding  individuals  as  specified  in  the  quality  control  report  of  the  original  study 
[6].  The  resulting  samples  contained  2,938  control  individuals  (shared),  1,963  TID  case  indi¬ 
viduals  and  1,860  RA  case  individuals.  With  the  quality-controlled  matrix-format  genotype 
data  provided,  we  first  performed  a  preliminary  association  analysis  and  identified  SNPs  with 
pi  <  10“^  for  each  disease.  We  then  visually  inspected  the  distribution  of  bi-allelic  raw  signals 
underlying  the  genotype  calls  for  each  SNP  and  excluded  variants  with  poor  clustering  as  pre¬ 
viously  described  [6].  We  further  excluded  isolated  SNPs  that  showed  strong  association 
strengths  without  neighboring  variants  and  obtained  458,934  and  458,511  SNPs  for  TID  and 
RA,  respectively.  AU  genomic  positions  are  with  respect  to  the  GRCh37  reference. 

Cross-validation 

We  performed  five-fold  cross-validation  using  the  genome-wide  data  under  varying  values  of 
an  I2  penalizer  X  by  first  selecting  a  cutoff and  selecting  SNPs  withp;  <Pc  using  the  training 
set  only.  The  number  of  SNPs  selected  in  this  way  varied  with  each  cross-validation  run 
because  the  training  sets  were  distinct.  For  each  selection,  we  inferred  both  additive  and  inter¬ 
action  parameters  involving  the  selected  SNPs  using  maximum  likelihood  and  predicted  dis¬ 
ease  phenotypes  for  the  test  set  based  on  the  disease  probability  given  by  Bayes’  theorem. 

These  predictions  from  aU  cross-validation  runs  were  combined  to  calculate  the  AUG.  These 
steps  were  repeated  for  different  X  values  andp^.  values.  We  calculated  the  mean  values  of  the 
number  of  SNPs  selected  for  each  choice  ofp^. 

Collective  inference 

For  TID  analysis,  we  selected  50  SNPs  from  chromosome  6  based  onp;  value  ranks,  and  10 
SNPs  each  from  PTPN22,  CTLA4,INS,  IL2RA,  and  chr-12  loci  to  form  the  m  =  100  SNP  set. 
Additive  and  interaction  parameters  for  this  SNP  selection  were  inferred  under  the  penalizer 
value  X  =  0.01  suggested  by  the  cross-validation  AUG  maximum.  Thep-values  associated  with 
the  additive  single-SNP  parameters  (Figs  2  and  3,  solid  bars,  middle)  were  obtained  by  the 
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asymptotic  null-distribution  approximation.  For  RA,  we  first  selected  255  SNPs  in  chromo¬ 
some  6  (MHC  region)  from  the  genome-wide  data  using  =  10“'*.  We  then  calculated  LD 
within  this  selection  of  SNPs  and  used  the  negative  logarithm  of  LD  f'  values  as  a  distance 
measure  to  cluster  them  (fc-means  clustering)  into  m  =  70  clusters,  each  with  high  LD  within 
the  group.  From  each  group,  we  then  selected  a  SNP  with  the  lowest  p,  value  to  form  the  final 
m-70  proxy  SNP  selection.  We  used  X  =  0.01  for  collective  inference  of  these  RA  data,  again 
based  on  the  cross-validation  AUC  maximum. 

Interaction  p-values  were  obtained  using  methods  as  described  [27].  Briefly,  we  first  calcu¬ 
lated  likelihood  ratio  statistics  corresponding  to  the  alternative  hypothesis  of  the  full  optimiza¬ 
tion  of  each  SNP  pair  interaction  versus  the  null  hypothesis  of  the  SNP  interaction  parameters 
being  identical  for  both  case  and  control  groups.  Null  distributions  of  these  statistics  for  each 
interaction  were  obtained  by  repeating  the  collective  inference  after  randomly  shuffling  phe¬ 
notype  labels.  This  sampling  was  repeated  at  least  5,000  times  for  TID  and  3,000  times  for 
RA.  As  sampling  depths  increased,  p-value  estimates  for  putatively  significant  pairs  tended 
to  increase.  The  final  sampling  depths  were  such  that  this  convergence  reached  reasonable 
saturation. 


Epigenetic  enrichment  analysis 

We  used  genome-wide  annotations  of  epigenetic  states  in  1 1 1  reference  epigenomes  based 
on  the  15-state  hidden  Markov  model  [32],  reducing  them  into  active  (transcribed:  TssA, 
TssAFlnk,  TxFlnk,  Tx,  TxWk;  enhancer:  Enh,  EnhG,  and  zinc  finger-associated:  ZNF,  Rpts) 
and  inactive  (heterochromatin:  Het,  bivalent:  TssBiv,  BivFlnk,EnhBiv,  repressed:  ReprPC, 
ReprPCWk;  quiescent:  Quies)  states.  We  obtained  the  frequency  of  active  states  in  each  epi- 
genome  by  calculating  the  fraction  of  genomic  segments  in  active  state.  For  the  proxy  SNP  sets 
of  each  diseases  (TID  and  RA),  we  constructed  the  group  of  all  known  SNPs  in  LD  with  each 
proxy  SNP  (r^  >  0.5)  from  1000  Genomes  Project  Phase  3  data  [30].  The  size  of  these  “LDed” 
SNP  groups  ranged  from  1  (proxy  alone)  to  -300  or  more  for  each  proxy.  We  calculated  the 
mean  active-state  frequencies  of  each  proxy  (Figs  2  and  3,  bottom)  in  different  epigenomes 
over  these  LDed  SNP  groups. 

The  enrichment  p-values  in  reference  epigenomes  on  the  single-SNP  level  (Fig  4A)  were 
obtained  by  comparing  these  frequencies  with  the  background  genome-wide  frequencies.  We 
used  binomial  tests  by  multiplying  the  frequencies  by  the  total  number  of  SNPs  (m  =  100  and 
70  for  TID  and  RA,  respectively)  with  fractional  numbers  rounded  up  to  integers.  For  the 
enrichment  of  interacting  SNP  pairs,  we  first  selected  SNP  pairs  with  interaction  p-values 
below  a  cutoff  (p  <  10“^).  We  then  considered  all  possible  combinations  of  two  epigenomes, 
and  counted  the  number  of  LDed  SNP  pairs  belonging  to  the  selected  proxy  SNP  pairs  where 
both  LDed  SNPs  were  active  in  each  respective  cell  types.  This  number  was  divided  by  the 
product  of  the  two  LDed  SNP  group  sizes  and  summed  over  the  proxy  set  to  obtain  the  effec¬ 
tive  number  of  active-state  pairs.  This  number  was  compared  using  binomial  tests  with  the 
background  value  estimated  by  the  product  of  active  state  frequencies  in  two  tissues  and  the 
total  number  of  significant  SNP  pairs. 

To  obtain  spatially  resolved  genetic  interaction  maps  (Fig  6),  we  divided  the  MHC  region 
(31-33  Mb  on  chromosome  6)  into  20  kb-sized  bins  and  built  two-dimensional  histograms  of 
the  LDed  SNP  pairs  belonging  to  each  grid  active  in  two  chosen  tissue  combinations,  normal¬ 
ized  by  the  product  of  LDed  group  sizes  such  that  the  numbers  would  represent  the  effective 
number  of  active  SNP  pairs.  To  estimate  the  spatially  resolved  enrichment  p-values,  we 
enlarged  the  grid  size  to  100  kb  such  that  an  appreciable  subset  of  grids  would  have  effective 
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SNP  pair  counts  larger  than  1.  We  then  compared  these  effective  numbers  in  each  grid  points 
with  the  epigenome-wide  average  using  binomial  tests. 

Collective  inference  with  pathway-based  SNP  selection 

We  downloaded  human  genomic  pathway  (1,705  in  total)  gene  lists  from  www.reactome.org 
on  April  19,  2016  and  made  a  list  of  all  non-provisional  genes  involved  in  the  pathways.  From 
all  quality- controlled  sets  of  SNPs,  those  within  the  distance  of  50  kb  from  the  coding  region 
were  assigned  to  each  gene.  We  then  represented  each  pathway  by  the  non-redundant  list  of 
SNPs  assigned  to  the  constituent  genes.  The  number  of  SNPs  selected  in  each  pathway  ranged 
up  to  -10,000  (S7  Fig).  For  each  pathway,  we  used  independent-SNP  and  collective  inferences 
(mean-field  approximation  [27])  to  find  AUC  scores.  For  largest  pathways  with  more  than 
-2,000  SNPs,  pi  value-based  filtering  was  applied  to  cross-validation  to  reduce  model  sizes.  We 
downloaded  the  summary  statistics  data  of  the  TID  meta-analyses  by  Bradfield  et  al.  [8]  and 
Barrett  et  al.  [10]  from  www.tldbase.org  to  generate  the  genetic  association  map  of  the  loci 
suggested  by  pathway  analysis  (S9  Fig). 

Supporting  Information 

51  Fig.  Coverage  of  SNPs  in  LD  with  TID  proxy  SNPs.  Positions  of  1000  Genomes  Project 
SNPs  with  LD  (r^  >  0.5)  to  m  =  100  TID  proxy  SNPs  (Fig  2)  are  shown  with  proximal  gene 
coding  regions  at  the  bottom.  Generated  with  the  University  of  California,  Santa  Cruz  (UCSC) 
Genome  Brower,  https;//genome. ucsc.edu. 

(PDF) 

52  Fig.  Detailed  view  of  the  SNP  coverage  in  LD  with  the  four  TID  proxy  SNPs  near  the 
PSMB8/9  genes.  See  SI  Fig. 

(PDF) 

53  Fig.  LD  pattern  between  TID  proxy  SNPs.  See  Fig  2  for  the  approximate  gene  annotations 
of  the  proxy  SNPs. 

(PDF) 

54  Fig.  Coverage  of  SNPs  in  LD  with  RA  proxy  SNPs.  Positions  of  1000  Genomes  Project 
SNPs  with  LD  (r^  >  0.5)  to  m  =  70  RA  proxy  SNPs  (Fig  3)  are  shown  with  proximal  gene  cod¬ 
ing  regions  at  the  bottom.  Generated  with  the  UCSC  Genome  Browser,  https://genome.ucsc. 
edu. 

(PDF) 

55  Fig.  Epigenetic  annotation  of  the  genomic  region  near  HLA-DQA2  and  HLA-DQB2 
genes.  The  map  was  generated  using  the  Roadmap  epigenome  browser  at  http:// 
epigenomegateway.wustl.edu/browser. 

(PDF) 

56  Fig.  Spatial  interaction  map  of  TlD-associated  interacting  SNPs  enriched  in  three  cell 
type  combinations. 

(PDF) 

57  Fig.  Distribution  of  pathway  association  scores  with  TID  and  RA  phenotypes.  Infer¬ 
ences  with  and  without  interaction  effects  are  shown  together  as  functions  of  the  number 
SNPs  in  each  pathway.  Pathways  containing  MHG  class  II  genes  are  shown  in  red.  Vertical 
lines  are  95%  c.i.  IL,  independent  loci  inference;  CL,  collective  loci  inference. 

(PDF) 
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58  Fig.  Independent-SNP  p-value  profiles  of  pathway-implicated  genetic  regions  for  TID/ 

RA.  See  Fig  4C.  Genes  shown  in  red  are  those  in  top-ranked  pathways  in  Fig  7,  whose  coding 
regions  are  shown  shaded  in  gray.  Other  genes  of  interest  nearby  are  indicated  with  non- 
shaded  coding  regions.  Darker  blue  (TID)  and  orange  (RA)  symbols  are  the  SNPs  directly 
included  in  the  pathways  (within  50  kb  of  coding  region  andp,-  <  10”^). 

(PDF) 

59  Fig.  Independent-SNP  p-value  profiles  of  pathway-implicated  genetic  regions  based  on 
large  TID  meta-analyses.  Data  are  from  summary  statistics  of  studies  by  Bradfield  et  al.  [8] 
and  Barrett  et  al.  [10].  See  S8  Fig  for  comparison. 

(PDF) 

SI  Table.  Inter-loci  interactions  between  TID  proxy  SNPs  with  lowp-values. 

(XLS) 

Acknowledgments 

This  work  made  use  of  data  generated  by  the  Wellcome  Trust  Case-Control  Consortium 
(WTCCC).  A  full  list  of  the  investigators  who  contributed  to  the  generation  of  the  data  is  avail¬ 
able  from  www.wtccc.org.uk.  We  thank  Joy  Hoffman  for  help  with  computational  resource 
management.  The  opinions  and  assertions  contained  herein  are  the  private  views  of  the 
authors  and  are  not  to  be  construed  as  official  or  as  reflecting  the  views  of  the  U.S.  Army  or  of 
the  U.S.  Department  of  Defense.  This  paper  has  been  approved  for  public  release  with  unlim¬ 
ited  distribution.  The  computations  were  performed  using  the  high-performance  computing 
resources  at  the  U.S.  Army  Engineer  Research  and  Development  Center  and  the  U.S.  Air 
Force  Research  Laboratory. 

Author  Contributions 

Conceptualization:  HJW  JR. 

Formal  analysis:  HJW. 

Funding  acquisition:  JR. 

Investigation:  HJW. 

Methodology:  HJW. 

Software:  HJW  CY. 

Supervision:  JR. 

Writing  -  original  draft:  HJW. 

Writing  -  review  &  editing:  HJW  CY  JR. 

References 

1 .  Parkes  M,  Cortes  A,  van  Heel  DA,  Brown  MA.  Genetic  insights  into  common  pathways  and  compiex 
reiationships  among  immune-mediated  diseases.  Nat  Rev  Genet.  2013;  14(9):661-73.  Epub  2013/08/ 
07.  doi:  10.1038/nrg3502  PMiD;  23917628 

2.  van  Beiie  TL,  Coppieters  KT,  von  Herrath  MG.  Type  1  diabetes:  etioiogy,  immunoiogy,  and  therapeutic 
strategies.  Physioi  Rev.  2011;  91(1):79-118.  Epub  201 1/01/21.  doi:  10.1152/physrev.00003.2010 
PMID:  21248163 

3.  Firestein  GS.  Evolving  concepts  of  rheumatoid  arthritis.  Nature.  2003;  423(6937):356-61 .  Epub  2003/ 
05/16.  doi:  10.1038/nature01661  PMID:  12748655 


PLOS  ONE  I  DOI:10.1371/journal.pone.0169918  January  12,  2017 


24/27 


t'PLOS 


ONE 


Genetic  Interaction  Effects  in  Autoimmunity 


4.  Andiauer  TF,  Buck  D,  Antony  G,  Bayas  A,  Bechmann  L,  Bertheie  A,  et  ai.  Novei  muitipie  scierosis  sus- 
ceptibiiity  ioci  impiicated  in  epigenetic  reguiation.  Sci  Adv.  2016;  2(6):e1501678.  Epub  201 6/07/08.  doi: 
10.1 126/sciadv.  1501 678  PMID:  27386562 

5.  Poiychronakos  C,  Li  Q.  Understanding  type  1  diabetes  through  genetics:  advances  and  prospects.  Nat 
Rev  Genet.  2011;  12(1 1):781-92.  Epub  201 1/10/19.  doi:  10.1038/nrg3069  PMID:  22005987 

6.  Wellcome  T rust  Case  Control  Consortium.  Genome-wide  association  study  of  1 4,000  cases  of  seven 
common  diseases  and  3,000  shared  controls.  Nature.  2007;  447{7145):661-78.  Epub  2007/06/08.  doi: 
10.1038/nature05911  PMID:  17554300 

7.  Viatte  S,  Plant  D,  Raychaudhuri  S.  Genetics  and  epigenetics  of  rheumatoid  arthritis.  Nat  Rev  Rheuma¬ 
tol.  2013;  9{3):141-53.  Epub  201 3/02/06.  doi:  10.1038/nrrheum.2012.237  PMID:  23381558 

8.  Bradfield  JP,  Qu  HQ,  Wang  K,  Zhang  H,  Sleiman  PM,  Kim  CE,  et  al.  A  genome-wide  meta-analysis  of 
six  type  1  diabetes  cohorts  identifies  multiple  associated  loci.  PLoS  Genet.  201 1 ;  7{9):e1 002293.  Epub 
201 1/10/08.  doi:  10.1371/journal.pgen. 1002293  PMID:  21980299 

9.  Farh  KK,  Marson  A,  Zhu  J,  Kleinewietfeld  M,  Housley  WJ,  Beik  S,  et  al.  Genetic  and  epigenetic  fine 
mapping  of  causal  autoimmune  disease  variants.  Nature.  201 5;  51 8(7539):337^3.  Epub  2014/1 1/05. 
doi:  10.1038/nature13835  PMID:  25363779 

10.  Barrett  JC,  Clayton  DG,  Concannon  P,  Akolkar  B,  Cooper  JD,  Erlich  HA,  et  al.  Genome-wide  associa¬ 
tion  study  and  meta-analysis  find  that  over  40  loci  affect  risk  of  type  1  diabetes.  Nat  Genet.  2009;  41 
(6):703-7.  Epub  2009/05/12.  doi:  10.1038/ng.381  PMID:  19430480 

1 1 .  Bluestone  JA,  Herold  K,  Eisenbarth  G.  Genetics,  pathogenesis  and  clinical  interventions  in  type  1  dia¬ 
betes.  Nature.  2010;  464(7293):  1293-300.  Epub  2010/05/01 .  PMID:  20432533 

12.  Plagnol  V,  Howson  JM,  Smyth  DJ,  Walker  N,  Hafler  JP,  Wallace  C,  et  al.  Genome-wide  association 
analysis  of  autoantibody  positivity  in  type  1  diabetes  cases.  PLoS  Genet.  201 1 ;  7(8):e1 00221 6.  Epub 
2011/08/11.  doi:  10.1371/journal.pgen. 1002216  PMID:  21829393 

13.  Serreze  DV,  Fleming  SA,  Chapman  HD,  Richard  SD,  Leiter  EH,  Tisch  RM.  B  lymphocytes  are  critical 
antigen-presenting  cells  for  the  initiation  of  T  cell-mediated  autoimmune  diabetes  in  nonobese  diabetic 
mice.  J  Immunol.  1998;  161(8):3912-8.  Epub  1998/10/21.  PMID:  9780157 

14.  O’Neill  SK,  Shlomchik  MJ,  Giant  TT,  Cao  Y,  Doodes  PD,  Finnegan  A.  Antigen-specific  B  cells  are 
required  as  APCs  and  autoantibody-producing  cells  for  induction  of  severe  autoimmune  arthritis.  J 
Immunol.  2005;  174(6):3781-8.  Epub  2005/03/08.  PMID:  15749919 

15.  Behrens  M,  Smart  M,  Luckey  D,  Luthra  H,  TanejaV.  To  Bor  not  to  B:  role  of  B  cells  in  pathogenesis  of 
arthritis  in  HLA  transgenic  mice.  J  Autoimmun.  2011;  37(2):95-103.  Epub  201 1/06/1 5.  doi:  10.1016/j. 
jaut.201 1 .05.002  PMID:  21665435 

16.  McPheeCG,  SprouleTJ,  Shin  DM,  Bubier  JA,  Schott  WH,  Steinbuck  MP,  et  al.  MHC  class  I  family  pro¬ 
teins  retard  systemic  lupus  erythematosus  autoimmunity  and  B  cell  lymphomagenesis.  J  Immunol. 
2011;  187(9):4695-704.  Epub  201 1/10/04.  doi:  10.4049/]immunol.1 101776  PMID:  21964024 

17.  Klein  L,  Kyewski  B,  Allen  PM,  Hogquist  KA.  Positive  and  negative  selection  of  the  T  cell  repertoire:  what 
thymocytes  see  (and  don’t  see).  Nat  Rev  Immunol.  2014;  14(6):377-91.  Epub  201 4/05/1 7.  doi:  10. 
1038/nri3667  PMID:  24830344 

18.  Sadegh-Nasseri  S,  Kim  A.  MHC  Class  II  Auto-Antigen  Presentation  is  Unconventional.  Frontiers  in 
immunology.  2015;  6:372.  Epub  2015/08/1 1 .  doi:  10.3389/fimmu.201 5.00372  PMID:  26257739 

1 9.  Perera  J,  Meng  L,  Meng  F,  Huang  H.  Autoreactive  thymic  B  cells  are  efficient  antigen-presenting  cells 
of  cognate  self-antigens  for  T  cell  negative  selection.  Proc  Natl  Acad  Sci  USA.  2013;  1 10(42):17011- 
6.  Epub  201 3/1 0/02.  doi:  1 0. 1 073/pnas.  1 31 3001 1 1 0  PMI D:  24082098 

20.  Yamano  T,  Nedjic  J,  Hinterberger  M,  Steinert  M,  Koser  S,  Pinto  S,  et  al.  Thymic  B  Cells  Are  Licensed  to 
Present  Self  Antigens  for  Central  T  Cell  Tolerance  Induction.  Immunity.  2015;  42(6):1048-61.  Epub 
2015/06/14.  doi:  1 0.1 01 6/].immuni.201 5.05.01 3  PMID:  26070482 

21 .  Hogquist  KA,  Jameson  SC.  The  self-obsession  of  T  cells:  how  TCR  signaling  thresholds  affect  fate 
’decisions’  and  effector  function.  Nat  Immunol.  2014;  15(9):815-23.  Epub  2014/08/20.  doi:  10.1038/ni. 
2938  PMID:  25137456 

22.  Hsieh  CS,  Lee  HM,  Lio  CW.  Selection  of  regulatory  T  cells  in  the  thymus.  Nat  Rev  Immunol.  2012;  12 
(3):157-67.  Epub  2012/02/1 1 .  doi:  10.1038/nri3155  PMID:  22322317 

23.  Liu  Z,  Gerner  MY,  Van  Panhuys  N,  Levine  AG,  Rudensky  AY,  Germain  RN.  Immune  homeostasis 
enforced  by  co-localized  effector  and  regulatory  T  cells.  Nature.  2015.  Epub  2015/11/26. 

24.  Liu  Y,  Maxwell  S,  Feng  T,  Zhu  X,  Elston  RC,  Koyuturk  M,  et  al.  Gene,  pathway  and  network  frameworks 
to  identify  epistatic  interactions  of  single  nucleotide  polymorphisms  derived  from  GWAS  data.  BMC 
Syst  Biol.  201 2;  6  SuppI  3:S1 5.  Epub  201 3/01/1 1 . 


PLOS  ONE  I  DOI:10.1371/journal.pone.0169918  January  12,  2017 


25/27 


ONE 


Genetic  Interaction  Effects  in  Autoimmunity 


I'PLOS 


25.  Lenz  TL,  Deutsch  AJ,  Han  B,  Hu  X,  Okada  Y,  Eyre  S,  et  ai.  Widespread  non-additive  and  interaction 
effects  within  HLA  ioci  moduiate  the  risk  of  autoimmune  diseases.  Nat  Genet.  2015;  47(9):1 085-90. 
Epub  201 5/08/11.  doi:  10.1038/ng.3379  PMID:  26258845 

26.  Hu  X,  Deutsch  AJ,  Lenz  TL,  Onengut-Gumuscu  S,  Han  B,  Chen  WM,  et  ai.  Additive  and  interaction 
effects  at  three  amino  acid  positions  in  HLA-DQ  and  HLA-DR  moiecuies  drive  type  1  diabetes  risk.  Nat 
Genet.  2015;  47(8):898-905.  Epub  2015/07/15.  doi:  1 0.1 038/ng.3353  PMiD:  26168013 

27.  Woo  HJ,  Yu  C,  Kumar  K,  Goid  B,  Reifman  J.  Genotype  distribution-based  inference  of  coiiective  effects 
in  genome-wide  association  studies:  insights  to  age-reiated  macuiar  degeneration  disease  mechanism. 
BMC  Genomics.  2016;  17:695.  Epub  201 6/09/01.  doi:  10.1 186/s1 2864-01 6-2871 -3  PMID:  27576376 

28.  Evans  DM,  Visscher  PM,  Wray  NR.  Harnessing  the  information  contained  within  genome-wide  associa¬ 
tion  studies  to  improve  individuai  prediction  of  compiex  disease  risk.  Hum  Moi  Genet.  2009;  1 8 
(18):3525-31.  Epub  2009/06/26.  doi:  10.1093/hmg/ddp295  PMID:  19553258 

29.  Wei  Z,  Wang  K,  Qu  HQ,  Zhang  H,  Bradfieid  J,  Kim  C,  et  ai.  From  disease  association  to  risk  assess¬ 
ment:  an  optimistic  view  from  genome-wide  association  studies  on  type  1  diabetes.  PLoS  Genet.  2009; 
5(1 0):e1 000678.  Epub  2009/1 0/10.  doi:  10.1371/journai.pgen.1000678  PMiD:  19816555 

30.  1000  Genomes  Project  Consortium,  Abecasis  GR,  Auton  A,  Brooks  LD,  DePristo  MA,  Durbin  RM,  et  ai. 
An  integrated  map  of  genetic  variation  from  1 ,092  human  genomes.  Nature.  2012;  491  (7422):56-65. 
Epub  2012/11/07.  doi:  10.1038/nature1 1632  PMiD:  23128226 

31 .  Neefjes  J,  Jongsma  ML,  Paui  P,  Bakke  O.  Towards  a  systems  understanding  of  MHC  ciass  i  and  MHC 
ciass  ii  antigen  presentation.  Nat  Rev  Immunoi.  2011;  1 1(12):823-36.  Epub  201 1/1 1/15.  doi:  10.1038/ 
nri3084  PMiD:  22076556 

32.  Roadmap  Epigenomics  Consortium,  Kundaje  A,  Meuieman  W,  Ernst  J,  Biienky  M,  Yen  A,  et  ai.  integra¬ 
tive  anaiysis  of  1 1 1  reference  human  epigenomes.  Nature.  2015;  518(7539):317-30.  Epub  201 5/02/20. 
doi:  10.1038/nature14248  PMiD:  25693563 

33.  Yang  S,  Fujikado  N,  Koiodin  D,  Benoist  C,  Mathis  D.  Reguiatory  T  ceiis  generated  eariy  in  iife  piay  a  dis¬ 
tinct  roie  in  maintaining  seif-toierance.  Science.  2015;  348(6234):589-94.  Epub  201 5/03/21.  doi:  10. 

1 126/science.aaa7017  PMiD:  25791085 

34.  Arnett  HA,  Viney  JL.  Immune  moduiation  by  butyrophiiins.  Nat  Rev  Immunol.  2014;  14(8):559-69.  Epub 
2014/07/26.  doi:  10.1038/nri3715  PMID:  25060581 

35.  Valentonyte  R,  Hampe  J,  Huse  K,  Rosenstiel  P,  Albrecht  M,  Stenzel  A,  et  ai.  Sarcoidosis  is  associated 
with  a  truncating  splice  site  mutation  in  BTNL2.  Nat  Genet.  2005;  37(4):357-64.  Epub  2005/03/01 .  doi: 
10.1038/ng1519PMID:  15735647 

36.  Narayan  K,  Su  KW,  Chou  CL,  Khoruzhenko  S,  Sadegh-Nasseri  S.  HLA-DM  mediates  peptide  exchange 
by  interacting  transiently  and  repeatedly  with  HLA-DR1.  Mol  Immunol.  2009;  46(1 5):31 57-62.  Epub 
2009/08/04.  doi:  10.1016/].molimm.2009.07.001  PMID:  19647320 

37.  Lenormand  C,  Bausinger  H,  Gross  F,  Signorino-Gelo  F,  Koch  S,  Peressin  M,  et  ai.  HLA-DQA2  and 
HLA-DQB2  genes  are  specifically  expressed  in  human  Langerhans  cells  and  encode  a  new  HLA  class 
II  molecule.  J  Immunol.  2012;  188{8):3903-11.  Epub  201 2/03/1 3.  doi:  10.4049/jimmunol.  11 03048 
PMID:  22407913 

38.  Binici  J,  Koch  J.  BAG-6,  ajackof  all  trades  in  health  and  disease.  Cell  Mol  Life  Sci.  2014;  71(10):1829- 
37.  Epub  201 3/1 2/07.  doi:  10.1007/s00018-013-1522-y  PMID:  24305946 

39.  Kamper  N,  Franken  S,  Temme  S,  Koch  S,  BieberT,  Koch  N.  gamma-Interferon-regulated  chaperone 
governs  human  lymphocyte  antigen  class  II  expression.  FASEB  J.  2012;  26(1  ):1 04-1 6.  Epub  2011/09/ 
24.  doi:  10.1096/fj. 11-189670  PMID:  21940994 

40.  Wherry  EJ,  Kurachi  M.  Molecular  and  cellular  insights  intoT  cell  exhaustion.  Nat  Rev  Immunol.  2015; 
15(8):486-99.  Epub  201 5/07/25.  doi:  10.1038/nri3862  PMID:  26205583 

41 .  Rangachari  M,  Zhu  C,  Sakuishi  K,  Xiao  S,  Karman  J,  Chen  A,  et  ai.  Bat3  promotes  T  cell  responses  and 
autoimmunity  by  repressing  Tim-3-mediated  cell  death  and  exhaustion.  Nat  Med.  2012;  18(9):1394- 
400.  Epub  2012/08/07.  doi:  10.1038/nm.2871  PMID:  22863785 

42.  Oh  H,  Ghosh  S.  NF-kappaB:  roles  and  regulation  in  different  CD4(-h)  T-cell  subsets.  Immunol  Rev. 

2013;  252(1):41-51.  Epub  2013/02/15.  doi:  10.1 1 1 1/imr.12033  PMID:  23405894 

43.  Zaiss  DM,  Bekker  CP,  Grone  A,  Lie  BA,  Sijts  AJ.  Proteasome  immunosubunits  protect  against  the 
development  of  CD8  T  cell-mediated  autoimmune  diseases.  J  Immunol.  2011;  187(5):2302-9.  Epub 
201 1/08/02.  doi:  10.4049/jimmunol.  11 01 003  PMID:  21804012 

44.  Croft  D,  Mundo  AF,  Haw  R,  Milacic  M,  Weiser  J,  Wu  G,  et  ai.  The  Reactome  pathway  knowledgebase. 
Nucleic  Acids  Res.  2014;  42(Database  issue):D472-7.  Epub  201 3/1 1/19.  doi:  10.1093/nar/gkt1 102 
PMID:  24243840 

45.  Wang  K,  Li  M,  Hakonarson  H.  Analysing  biological  pathways  in  genome-wide  association  studies.  Nat 
Rev  Genet.  2010;  11(1 2):843-54.  Epub  2010/1 1/19.  doi:  10.1038/nrg2884  PMID:  21085203 


PLOS  ONE  I  DOI:10.1371/journal.pone.0169918  January  12,  2017 


26/27 


)PLOS 


ONE 


Genetic  Interaction  Effects  in  Autoimmunity 


46.  Poiuektov  YO,  Kim  A,  Sadegh-Nasseri  S.  HLA-DO  and  its  Roie  in  MHC  Ciass  li  Antigen  Presentation. 
Frontiers  in  immunoiogy.  2013;  4:260.  Epub  201 3/09/07.  doi:  10.3389/fimmu. 201 3.00260  PMiD: 
24009612 

47.  Browniie  RJ,  Zamoyska  R.  T  ceii  receptor  signaiiing  networks:  branched,  diversified  and  bounded.  Nat 
Revlmmunoi.2013;  13(4):257-69.  Epub  2013/03/26.  doi:  10.1038/nri3403  PMiD:  23524462 

48.  Griffiths  EK,  Penninger  JM.  Communication  between  the  TCR  and  integrins:  roie  of  the  moiecuiar 
adapter  ADAP/Fyb/Siap.  CurrOpin  Immunoi.  2002;  14(3):317-22.  Epub  2002/04/26.  PMID:  1 1973129 

49.  Wang  H,  Rudd  CE.  SKAP-55,  SKAP-55-related  and  ADAP  adaptors  modulate  integrin-mediated 
immune-cell  adhesion.  Trends  Cell  Biol.  2008;  18(10):486-93.  Epub  2008/09/02.  doi:  10.1016/j.tcb. 
2008.07.005  PMID:  18760924 

50.  Wu  K,  Kovacev  J,  Pan  ZQ.  Priming  and  extending:  a  UbcH5/Cdc34  E2  handoff  mechanism  for  polyubi- 
quitination  on  a  SCF  substrate.  Mol  Cell.  2010;  37{6):784-96.  Epub  2010/03/30.  doi:  10.1016/j.molcel. 
2010.02.025  PMID:  20347421 

51 .  Chen  L,  Flies  DB.  Molecular  mechanisms  of  T  cell  co-stimulation  and  co-inhibition.  Nat  Rev  Immunol. 
2013;  13(4):227-^2.  Epub  2013/03/09.  doi:  10.1038/nri3405  PMID:  23470321 

52.  Kile  BT,  Schulman  BA,  Alexander  WS,  Nicola  NA,  Martin  HM,  Hilton  DJ.  The  SOCS  box:  a  tale  of 
destruction  and  degradation.  Trends  Biochem  Sci.  2002;  27(5):235^1 .  Epub  2002/06/22.  PMID: 
12076535 

53.  Deshaies  RJ,  Joazeiro  CA.  RING  domain  E3  ubiquitin  ligases.  Annu  Rev  Biochem.  2009;  78:399^34. 
Epub  2009/06/06.  doi:  10.1 146/annurev.biochem.78.101807.093809  PMID:  19489725 

54.  Platanias  LC.  Mechanisms  of  type-1- and  type-1  l-interferon-mediated  signalling.  Nat  Rev  Immunol. 

2005;  5(5):375-86.  Epub  2005/05/03.  doi:  10.1038/nri1604  PMID:  15864272 

55.  Bourdeau  A,  Dube  N,  T remblay  ML.  Cytoplasmic  protein  tyrosine  phosphatases,  regulation  and  func¬ 
tion:  the  roles  of  PTP1 B  and  TC-PTP.  Curr  Opin  Cell  Biol.  2005;  17(2):203-9.  Epub  2005/03/23.  doi: 
10.1016/j.ceb.2005.02.001  PMID:  15780598 

56.  Yadav  M,  Stephan  S,  Bluestone  JA.  Peripherally  Induced  Tregs — Role  in  Immune  Homeostasis  and 
Autoimmunity.  Frontiers  in  immunology.  2013;  4. 

57.  Malhotra  D,  Linehan  JL,  Dileepan  T,  Lee  YJ,  Purtha  WE,  Lu  JV,  et  al.  Tolerance  is  established  in  poly¬ 
clonal  CD4(-h)  T  cells  by  distinct  mechanisms,  according  to  self-peptide  expression  patterns.  Nat  Immu¬ 
nol.  2016;  17(2):187-95.  Epub  2016/01/05.  doi:  10.1038/ni.3327  PMID:  26726812 

58.  Lu  FT,  Yang  W,  Wang  YH,  Ma  HD,  Tang  W,  Yang  JB,  etal.  Thymic  B  cells  promote  thymus-derived  reg¬ 
ulatory  T  cell  development  and  proliferation.  J  Autoimmun.  2015;  61:62-72.  Epub  201 5/06/1 5.  doi:  10. 

1 01 6/j.jaut.201 5.05.008  PMID:  26071985 

59.  Thedrez  A,  Sabourin  C,  Gertner  J,  Devilder  MC,  Allain-Maillet  S,  Fournie  JJ,  et  al.  Self/non-self  discrimi¬ 
nation  by  human  gammadelta  T  cells:  simple  solutions  for  a  complex  issue?  Immunol  Rev.  2007; 
215:123-35.  Epub  2007/02/13.  doi:  10.1 1 1 1/j.1600-065X.2006.00468.x  PMID:  17291284 

60.  Purcell  S,  Neale  B,  Todd-Brown  K,  Thomas  L,  Ferreira  MA,  Bender  D,  et  al.  PLINK:  a  tool  set  for  whole- 
genome  association  and  population-based  linkage  analyses.  Am  J  Hum  Genet.  2007;  81  (3):559-75. 
Epub  2007/08/19.  doi:  10.1086/519795  PMID:  17701901 


PLOS  ONE  I  DOI:10.1371/journal.pone.0169918  January  12,  2017 


27/27 


